2018
DOI: 10.1186/s13636-018-0128-6

Advanced recurrent network-based hybrid acoustic models for low resource speech recognition

Abstract: Recurrent neural networks (RNNs) have shown an ability to model temporal dependencies. However, the problem of exploding or vanishing gradients has limited their application. In recent years, long short-term memory RNNs (LSTM RNNs) have been proposed to solve this problem and have achieved excellent results. Bidirectional LSTM (BLSTM), which uses both preceding and following context, has shown particularly good performance. However, the computational requirements of BLSTM approaches are quite heavy, even when …
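The abstract's point about bidirectional context can be illustrated with a minimal sketch (not the paper's model): a one-unit vanilla RNN run once left-to-right and once right-to-left, so each time step is paired with both a preceding-context and a following-context state. The weights `w` and `u` are arbitrary illustrative values.

```python
import math

def rnn_pass(xs, w=0.5, u=0.3):
    """One-unit vanilla RNN: h_t = tanh(w*x_t + u*h_{t-1})."""
    h, hs = 0.0, []
    for x in xs:
        h = math.tanh(w * x + u * h)
        hs.append(h)
    return hs

def bidirectional(xs):
    """Pair each step's forward state with the backward state,
    so every output sees both past and future context --
    at the cost of a second full pass over the sequence."""
    fwd = rnn_pass(xs)
    bwd = rnn_pass(xs[::-1])[::-1]
    return list(zip(fwd, bwd))

out = bidirectional([1.0, -0.5, 2.0, 0.0])
```

The second pass is what makes BLSTM-style models roughly twice as expensive as their unidirectional counterparts, which is the computational burden the abstract refers to.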

Cited by 13 publications (7 citation statements); References 35 publications.
“…Better expressiveness is shown in the low-resource scenario, given that the GRU has fewer parameters. Kang et al. [24] proposed a local BGRU with residual learning, in which all time-dependency relationships are considered within a fixed local window.…”
Section: Recurrent Neural Network
confidence: 99%
“…
Speaker, Language, and Gender Identification — [27]: LSTM.
Speech/Audio/Music Analysis & Synthesis — Garcia et al. [28]: music generation, LSTM & GRU; Madhok et al. [29]: music generation, stacked LSTM; Song et al. [30]: music tagging, GRU; Xie et al. [31]: speech emotion classification, LSTM; Kang et al. [32]: speech recognition, LSTM; Nakyama et al. [33]: audio chord classification, LSTM & GRU; Chen et al. [34]: voice detection, GRU.
Human Action/Interaction Recognition — Sho et al. [35]: human interaction recognition, hierarchical LSTM; Sho et al. [36]: person-person action recognition, concurrent LSTM.
…”
Section: Role of Gated RNNs in Various Application Domains
confidence: 99%
“…In hybrid acoustic modeling, various deep neural network models, such as feed-forward networks (fully connected deep neural networks (DNNs) [1][2][3][4][5][6][7][8] and convolutional neural networks (CNNs) [9][10][11][12][13][14][15][16]) and deep recurrent neural networks (conventional recurrent neural networks (RNNs) [17], long short-term memory (LSTM) [18][19][20][21][22], and gated recurrent units (GRUs) [23][24][25][26]), have been examined for both low- and high-resource-language speech recognition tasks. Each of these neural network models has strengths and weaknesses.…”
Section: Introduction
confidence: 99%
“…The limitations of DNN and CNN models can be overcome using recurrent neural network models, which can model long-term dependencies between speech features by unfolding over long time steps. However, the conventional RNN model suffers from vanishing and exploding gradients during stochastic gradient descent (SGD) training [19,20,22]. These limitations of the RNN model can be reduced using advanced recurrent neural network models (LSTM [12,19,22] and GRU [24,25]).…”
Section: Introduction
confidence: 99%
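The vanishing-gradient problem mentioned above can be made concrete with a scalar toy model (not from the paper): for h_t = tanh(u * h_{t-1}), the backpropagated factor ∂h_T/∂h_0 is a product of per-step Jacobians u * (1 - h_t²), each of magnitude below |u|, so with |u| < 1 the gradient shrinks geometrically as the number of unrolled steps grows. The values of `u` and the initial state are illustrative.

```python
import math

def grad_through_time(u, steps, h=0.5):
    """|dh_T/dh_0| for the scalar RNN h_t = tanh(u * h_{t-1}):
    the product of per-step Jacobians u * (1 - h_t**2)."""
    g = 1.0
    for _ in range(steps):
        h_new = math.tanh(u * h)
        g *= u * (1.0 - h_new ** 2)
        h = h_new
    return g

short = grad_through_time(0.9, steps=5)    # modest decay
long = grad_through_time(0.9, steps=100)   # near-zero gradient
```

With a large recurrent weight the product instead grows until tanh saturates, giving the exploding counterpart; LSTM and GRU gating mitigates both regimes by providing a more direct additive path for the gradient.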