2015
DOI: 10.1109/taslp.2015.2450491

Coupled Dictionaries for Exemplar-Based Speech Enhancement and Automatic Speech Recognition

Abstract: Exemplar-based speech enhancement systems work by decomposing the noisy speech as a weighted sum of speech and noise exemplars stored in a dictionary, and use the resulting speech and noise estimates to obtain a time-varying filter in the full-resolution frequency domain to enhance the noisy speech. To obtain the decomposition, exemplars sampled in lower dimensional spaces are preferred over the full-resolution frequency domain for their reduced computational complexity and the ability to better generalize to …
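
The decomposition-then-filter idea in the abstract can be illustrated with a small sketch. It is not the paper's method: it works on a single frame in a single feature domain and so does not reproduce the coupled-dictionary mapping between a low-dimensional space and the full-resolution frequency domain. The solver (multiplicative updates for the generalized Kullback-Leibler divergence with a fixed dictionary), the Wiener-like gain, and the names `estimate_activations` and `enhance_frame` are all assumptions chosen for illustration.

```python
def estimate_activations(y, atoms, n_iter=100, eps=1e-12):
    """Return non-negative weights x such that y[f] ~ sum_k x[k] * atoms[k][f].

    Assumption: multiplicative updates for the generalized KL divergence
    with the dictionary (atoms) held fixed; the source does not name a solver.
    """
    n_bins = len(y)
    x = [1.0] * len(atoms)
    atom_sums = [sum(atom) + eps for atom in atoms]
    for _ in range(n_iter):
        # Current reconstruction of the noisy frame from all atoms.
        approx = [sum(w * atom[f] for w, atom in zip(x, atoms)) + eps
                  for f in range(n_bins)]
        ratio = [y[f] / approx[f] for f in range(n_bins)]
        # Scale each weight by how well its atom explains the residual ratio.
        x = [w * sum(atom[f] * ratio[f] for f in range(n_bins)) / s
             for w, atom, s in zip(x, atoms, atom_sums)]
    return x


def enhance_frame(y, speech_atoms, noise_atoms, n_iter=100, eps=1e-12):
    """Enhance one non-negative spectral frame y (hypothetical helper).

    Returns (enhanced_frame, gain), where gain is a per-bin filter in [0, 1].
    """
    atoms = speech_atoms + noise_atoms  # joint speech + noise dictionary
    x = estimate_activations(y, atoms, n_iter, eps)
    n_speech = len(speech_atoms)
    n_bins = len(y)
    # Split the reconstruction into its speech and noise parts.
    speech = [sum(w * a[f] for w, a in zip(x[:n_speech], speech_atoms))
              for f in range(n_bins)]
    noise = [sum(w * a[f] for w, a in zip(x[n_speech:], noise_atoms))
             for f in range(n_bins)]
    # Wiener-like gain: fraction of each bin attributed to speech.
    gain = [s / (s + n + eps) for s, n in zip(speech, noise)]
    enhanced = [g * v for g, v in zip(gain, y)]
    return enhanced, gain
```

Calling `enhance_frame` once per frame yields a gain that changes from frame to frame, which is the sense in which the resulting filter is time-varying. A real system would operate on magnitude spectrograms and on exemplars spanning several frames.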

Cited by 33 publications (19 citation statements)
References 39 publications
“…Using this approach, modulation spectrogram features are being presented for the exemplar-based tasks [11].…”
Section: Related Work (mentioning)
confidence: 99%
“…The earliest work on NMF based dereverberation [3] uses a convolutive NMF (referred as C-NMF) model for the reverb spectrogram. Since then many modifications to this have been proposed both in single-channel [4,5,6,7,8] and multi-channel scenario [9].…”
Section: Introduction (mentioning)
confidence: 99%
“…The C-NMF model for speech dereverberation was improved by additionally incorporating a NMF model for clean speech [5,6]. Various supervised approaches to handle reverberation in noisy environments have also been proposed [7,10,11]. Different regularization on RIRs in single-channel [11,12] and multi-channel [13] scenario have been proposed leading to better speech enhancement.…”
Section: Introduction (mentioning)
confidence: 99%
“…In particular, speech recognition in exemplar based approaches is performed either using the atom activations of the estimated sparse feature vector [3], [4], or using the minimum reconstruction error [5] between the test exemplar and its estimate. On the contrary, in feature based approaches, either the derived sparse vector [1] or the estimate of speech is used as a feature [6] for acoustic modeling.…”
Section: Introduction (mentioning)
confidence: 99%
“…On the contrary, in feature based approaches, either the derived sparse vector [1] or the estimate of speech is used as a feature [6] for acoustic modeling. For computing the sparse feature vector, approaches in [3], [4] use a single overcomplete dictionary while [5] use multiple dictionaries corresponding to different speech units. A gradient descent approach is used to learn a single overcomplete dictionary using the spectro-temporal representation in [1], while mel frequency cepstral coefficients (MFCC) of training speech data (frames) are used to obtain dictionary atoms in [6] and [2].…”
Section: Introduction (mentioning)
confidence: 99%