2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014
DOI: 10.1109/icassp.2014.6854478

Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition

Abstract: This paper describes our joint efforts to provide robust automatic speech recognition (ASR) for reverberated environments, such as in hands-free human-machine interaction. We investigate blind feature space de-reverberation and deep recurrent de-noising auto-encoders (DAE) in an early fusion scheme. Results on the 2014 REVERB Challenge development set indicate that the DAE front-end provides complementary performance gains to multi-condition training, feature transformations, and model adaptation. The proposed…
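
The abstract does not spell out how the early fusion is performed. One common reading is feature-level fusion: the reverberant features and their blindly de-reverberated counterparts are concatenated frame by frame before being fed to the DAE. The sketch below illustrates only that assumption; the function name `early_fusion` and the toy feature values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Frame = List[float]


def early_fusion(reverberant: Sequence[Sequence[float]],
                 dereverberated: Sequence[Sequence[float]]) -> List[Frame]:
    """Concatenate two time-aligned feature streams frame by frame.

    Assumption: both streams have one feature vector per frame and
    cover the same frames, so frame t of one matches frame t of the other.
    """
    if len(reverberant) != len(dereverberated):
        raise ValueError("feature streams must have the same number of frames")
    # Each fused frame is the reverberant vector followed by the
    # de-reverberated vector for the same time step.
    return [list(r) + list(d) for r, d in zip(reverberant, dereverberated)]


# Toy data: 3 frames, 2 coefficients per stream (made-up values).
reverberant = [[0.9, 0.4], [0.8, 0.5], [0.7, 0.6]]
dereverberated = [[0.6, 0.1], [0.5, 0.2], [0.4, 0.3]]

fused = early_fusion(reverberant, dereverberated)
```

Under this assumption, each fused frame has twice the per-stream dimensionality and would serve as the DAE input, with the matching clean-speech features as the training target.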

Cited by 69 publications (36 citation statements)
References 22 publications
“…Recently, Weninger et al proposed a method for combining spectral subtraction with reverberation time estimation-based dereverberation and DAE [56]. They used reverberant and dereverberant speech to train the deep recurrent denoising autoencoder.…”
Section: Discussion (mentioning)
confidence: 99%
“…Deep neural networks (DNNs) with pre-training have been shown to achieve better performance than the conventional MLP without pre-training [22]. There are many DNN, recurrent neural network (RNN) and long short-term memory (LSTM) [23][24][25][26][27][28][29] based speech enhancement and feature enhancement approaches that have been proposed for speech enhancement for human listening and robust speech recognition and that have shown good performance for the REVERB challenge task [40]. Recently, the denoising autoencoder (DAE), one type of DNN, has been shown to be effective in many noise reduction applications because higher-level representations and increased flexibility of the feature mapping function can be learned [30][31][32][33].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, the results of DAE with small reverberation are not good compared to other methods. Typically, in the training of a DAE [26,34,35], data incorporating various environmental conditions are used.…”
Section: Introduction (mentioning)
confidence: 99%
“…Acoustic models were retrained using the reconstructed features. Weninger et al. [27,28] have shown that deep recurrent neural networks (RNNs) are also suitable for feature enhancement of reverberant speech signals. Recently, Mimura et al. [29] augmented the input of the autoencoder based on long short-term memory (LSTM) [30] with phone-class information (denoted as pLSTM).…”
Section: Introductionmentioning
confidence: 99%