2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2014.6853900
Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition

Abstract: Denoising autoencoders (DAs) have shown success in generating robust features for images, but there has been limited work in applying DAs for speech. In this paper we present a deep denoising autoencoder (DDA) framework that can produce robust speech features for noisy reverberant speech recognition. The DDA is first pre-trained as restricted Boltzmann machines (RBMs) in an unsupervised fashion. Then it is unrolled to autoencoders, and fine-tuned by corresponding clean speech features to learn a nonlinear mapp…
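The mapping the abstract describes — a network fed noisy speech features and fine-tuned against the corresponding clean features — can be sketched minimally as follows. Everything here (the feature dimensions, the synthetic data, the single tanh hidden layer, plain full-batch gradient descent) is an illustrative assumption; the paper's DDA is deeper and is pre-trained as stacked RBMs before the supervised fine-tuning stage.

```python
import numpy as np

# Minimal sketch of denoising-autoencoder-style feature enhancement:
# train a small network to map noisy features to their clean versions.
# Dimensions and data are synthetic stand-ins, not the paper's setup.
rng = np.random.default_rng(0)
n_frames, n_dim, n_hidden = 200, 24, 64

clean = rng.standard_normal((n_frames, n_dim))          # "clean" features
noisy = clean + 0.3 * rng.standard_normal(clean.shape)  # corrupted input

# One tanh hidden layer with a linear reconstruction layer.
W1 = 0.1 * rng.standard_normal((n_dim, n_hidden)); b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.standard_normal((n_hidden, n_dim)); b2 = np.zeros(n_dim)

lr = 0.1
for _ in range(3000):
    h = np.tanh(noisy @ W1 + b1)   # encode the noisy input
    out = h @ W2 + b2              # reconstruct toward the clean target
    err = out - clean              # gradient of the MSE loss w.r.t. out
    # Backpropagate through both layers (full batch).
    gW2 = h.T @ err / n_frames
    gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)
    gW1 = noisy.T @ dh / n_frames
    gb1 = dh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

denoised = np.tanh(noisy @ W1 + b1) @ W2 + b2
mse_before = np.mean((noisy - clean) ** 2)
mse_after = np.mean((denoised - clean) ** 2)
```

After training, the mapped features sit measurably closer to the clean targets than the raw noisy features do, which is the feature-level enhancement effect the paper exploits for recognition.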

Cited by 198 publications (96 citation statements). References 14 publications.
“…Furthermore, results show additional improvement when fMLLR is applied in combination with the mapping approach. Note that in [55], a denoising autoencoder is used for mapping, and ASR improvement is reported when recognizing noisy speech. One of the reasons for this is that they initialize the deep neural network by performing pre-training using an efficient algorithm [55].…”
Section: Discussion (mentioning)
confidence: 99%
“…Note that in [55], a denoising autoencoder is used for mapping, and ASR improvement is reported when recognizing noisy speech. One of the reasons for this is that they initialize the deep neural network by performing pre-training using an efficient algorithm [55]. In our preliminary experiments, direct feature mapping (i.e., SDM MFCC to IHM MFCC) does not yield any improvement when recognizing distant speech if we do not initialize the neural network.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, advances in deep learning research have led to recent breakthroughs in unsupervised audio feature extraction methods and exceptional recognition performance improvements [13,17,32]. Advances in novel machine learning algorithms, improved availability of computational resources, and the development of large databases have led to self-organization of robust audio features by efficient training of large-scale DNNs with large-scale datasets.…”
Section: Audio Feature Extraction Mechanisms (mentioning)
confidence: 99%
“…In [24], some novel methods for taking advantage of reverberant speech training in modern DNN-based hidden Markov model (DNN-HMM) systems were proposed. Based on multi-condition training, feature-level dereverberation by deep autoencoders (DAEs) has been investigated in [25,26]. In these works, DAEs were trained using reverberant speech features as input and clean speech features as learning targets.…”
Section: Introduction (mentioning)
confidence: 99%