Driven by the vision of the Internet of Things, some research efforts have focused on designing efficient speech recognition networks for edge computing. However, existing approaches (such as tpool2) do not make full use of the spatial and temporal information in the acoustic features of speech. In this paper, we propose EdgeRNN, a compact speech recognition network with spatio-temporal features for edge computing. Specifically, EdgeRNN uses a 1-Dimensional Convolutional Neural Network (1-D CNN) to process the overall spatial information of each frequency band of the acoustic features, and a Recurrent Neural Network (RNN) to process the temporal information of each frequency band. In addition, we propose a simplified attention mechanism that enhances the portions of the network contributing to the final recognition. The overall performance of EdgeRNN has been verified on speech emotion recognition and keyword recognition. On the IEMOCAP dataset for speech emotion recognition, the unweighted average recall (UAR) reaches 63.98%; on Google's Speech Commands Dataset V1 for keyword recognition, the weighted average recall (WAR) reaches 96.82%. Compared with related efficient networks evaluated on a Raspberry Pi 3B+, EdgeRNN improves accuracy on both speech emotion and keyword recognition.
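The abstract does not specify the form of the simplified attention mechanism. A minimal sketch of one common lightweight variant, assumed here for illustration (a single learned scoring vector `w` over the RNN hidden states, producing a softmax-weighted context vector), is:

```python
import math

def softmax(scores):
    # numerically stable softmax over a list of scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def simple_attention(hidden_states, w):
    """hidden_states: list of T vectors (each of length d) from an RNN.
    w: learned scoring vector of length d (hypothetical parameter,
    not taken from the paper). Returns (attention weights, context vector)."""
    # one scalar score per time step: dot product of hidden state with w
    scores = [sum(h_j * w_j for h_j, w_j in zip(h, w)) for h in hidden_states]
    alpha = softmax(scores)  # weights over time steps, sum to 1
    d = len(hidden_states[0])
    # context vector: attention-weighted sum of hidden states over time
    context = [sum(alpha[t] * hidden_states[t][j]
                   for t in range(len(hidden_states)))
               for j in range(d)]
    return alpha, context

# toy example: 4 time steps, hidden size 3
H = [[0.1, 0.2, 0.3], [0.5, 0.1, 0.0], [0.2, 0.2, 0.2], [0.9, 0.4, 0.1]]
w = [1.0, -0.5, 0.25]
alpha, context = simple_attention(H, w)
```

In this sketch, frames whose hidden states score highly against `w` contribute more to the context vector, which matches the stated goal of enhancing the portion of the network that contributes most to the final recognition.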