2018
DOI: 10.3390/sym10110605

Improvement of Speech/Music Classification for 3GPP EVS Based on LSTM

Abstract: Competition in smartphone speech recognition technology is now in full swing with the spread of Internet of Things (IoT) devices. For robust speech recognition, it is necessary to detect speech signals in various acoustic environments. Speech/music classification, which enables signal processing optimized according to the classification result, has been extensively adopted as an essential part of various electronics applications, such as multi-rate audio codecs, automatic speech recognition, an…

Cited by 5 publications (3 citation statements)
References 11 publications
“…The LSTM layer can selectively forget and retain information through the gating mechanism, effectively solve the gradient disappearance or explosion problem of RNN, and is suitable for the classification of timeseries signals. [55,56] The Bi-LSTM model can fuse forward and backward semantic information from two LSTM layers to provide complete contextual state information, which is beneficial for sound recognition. The Bi-LSTM output was used as input to the fully connected layer and was recognized and classified by the SoftMax activation function.…”
Section: Applications In Sound Signal Collection
mentioning
confidence: 99%
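The pipeline this citation statement describes (a Bi-LSTM whose output feeds a fully connected layer and a softmax) can be illustrated with a small forward pass. What follows is a minimal NumPy sketch, not the citing authors' implementation: the layer sizes, the random untrained weights, the gate ordering, the choice to summarize the sequence with the final state of each direction, and all function names (`init_lstm`, `lstm_pass`, `bilstm_classify`) are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(rng, n_in, n_hid):
    """Random (untrained) weights for one LSTM direction; four gates stacked row-wise."""
    return {
        "W": rng.normal(0.0, 0.1, (4 * n_hid, n_in)),   # input-to-gate weights
        "U": rng.normal(0.0, 0.1, (4 * n_hid, n_hid)),  # hidden-to-gate weights
        "b": np.zeros(4 * n_hid),
    }

def lstm_pass(x_seq, p):
    """Run one LSTM over x_seq of shape (T, n_in); return hidden states (T, n_hid)."""
    n_hid = p["U"].shape[1]
    h = np.zeros(n_hid)
    c = np.zeros(n_hid)
    states = []
    for x_t in x_seq:
        z = p["W"] @ x_t + p["U"] @ h + p["b"]
        i, f, o, g = np.split(z, 4)          # assumed gate order: input, forget, output, candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                    # forget gate scales old memory, input gate admits new
        h = o * np.tanh(c)
        states.append(h)
    return np.stack(states)

def bilstm_classify(x_seq, fwd, bwd, W_out, b_out):
    """Bi-LSTM -> fully connected layer -> softmax over classes."""
    h_fwd = lstm_pass(x_seq, fwd)
    h_bwd = lstm_pass(x_seq[::-1], bwd)[::-1]   # reverse again to realign with time order
    # Assumption: use the state of each direction that has seen the whole sequence.
    feat = np.concatenate([h_fwd[-1], h_bwd[0]])
    logits = W_out @ feat + b_out               # fully connected layer
    e = np.exp(logits - logits.max())           # numerically stable softmax
    return e / e.sum()

# Hypothetical sizes: 20 frames of 12 features, 8 hidden units, 2 classes.
rng = np.random.default_rng(0)
T, n_in, n_hid, n_cls = 20, 12, 8, 2
x = rng.normal(size=(T, n_in))
fwd, bwd = init_lstm(rng, n_in, n_hid), init_lstm(rng, n_in, n_hid)
W_out = rng.normal(0.0, 0.1, (n_cls, 2 * n_hid))
b_out = np.zeros(n_cls)
probs = bilstm_classify(x, fwd, bwd, W_out, b_out)
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how the forward and backward passes are combined before classification.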
“…The frequently applied machine learning methods include K-Nearest Neighbor (k-NN) [1,2,3,4,5,6], Support Vector Machine (SVM) [7,8,9], Long Short-Term Memory (LSTM) [10,11,12,13,14,15], and Convolutional Neural Network (CNN) and its variants [4,16,17,18,19,20,21,22,23,24], and ensemble models [25,26,27,28,29,30,31,18]. However, a majority of research and experiments done within the field of musical instrument recognition or music classification are targeted at those belonging to western culture, mostly of European and North American origin.…”
Section: Introduction
mentioning
confidence: 99%
“…This special issue of Symmetry entitled "Emerging Approaches and Advances in Big Data" consists of 17 papers [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17] that all present research in the emerging area of Big Data.…”
mentioning
confidence: 99%