FPGA-based Keyword Spotting Accelerator with Data Reuse Optimization
DOI: https://doi.org/10.54097/4btbhd09

Keywords: FPGA, Keyword Spotting, Data Reuse, TinyML, Hardware Optimization

Abstract
Keyword Spotting (KWS), also known as wake-word detection, has become a fundamental component of voice-driven systems such as smart speakers, mobile assistants, and Internet of Things devices, enabling a device to continuously listen for specific commands while keeping energy consumption to a minimum. Despite advances in neural networks, achieving efficient inference in resource-constrained environments remains a major challenge. The Field Programmable Gate Array (FPGA), a reconfigurable logic device whose function can be customized after manufacture, offers a good balance among flexibility, power consumption, and performance. This paper proposes an FPGA-based KWS accelerator optimization framework that uses a data reuse mechanism to improve computational efficiency and reduce off-chip memory accesses. The system integrates the convolutional, activation, and fully connected (FC) layers under unified finite-state control logic, and is modeled and simulated in Vivado 2017. The proposed row-buffer-based reuse scheme minimizes redundant memory operations: compared with the baseline design, the number of memory reads is reduced by 88.9% and latency is reduced by 50%. These results show that memory-centric architecture optimization can improve the inference performance of FPGA-based tiny machine learning (TinyML) systems. This work provides a scalable and energy-efficient framework for future intelligent edge systems.
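The reported 88.9% reduction in memory reads is consistent with the asymptotic reuse of a 3x3 convolution window: a row-buffer scheme fetches each input pixel once instead of re-fetching the full 3x3 window per output, saving up to 1 - 1/9 ≈ 88.9% of reads. The following Python sketch models only this read-count accounting (it is not the paper's hardware design; the 3x3 kernel and 32x32 feature-map size are illustrative assumptions):

```python
# Read-count model for a 3x3 convolution over an H x W feature map
# (illustrative sketch, not the paper's RTL implementation).

def naive_reads(h: int, w: int, k: int = 3) -> int:
    # Baseline: every output pixel re-fetches its full k x k input
    # window from off-chip memory.
    return (h - k + 1) * (w - k + 1) * k * k

def row_buffer_reads(h: int, w: int, k: int = 3) -> int:
    # Row-buffer reuse: each input pixel is fetched off-chip exactly
    # once; k-1 on-chip line buffers plus a k x k register window
    # supply every subsequent reuse of that pixel.
    return h * w

H, W = 32, 32  # assumed feature-map size for illustration
n, r = naive_reads(H, W), row_buffer_reads(H, W)
print(f"naive: {n}, row-buffer: {r}, reduction: {1 - r / n:.1%}")
```

For large feature maps the reduction approaches the ideal 8/9 (88.9%); at small sizes the border effect of the valid-convolution output makes it slightly lower.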
Copyright (c) 2026 Academic Journal of Science and Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.