2013
DOI: 10.1016/j.csl.2012.07.002

Uncertainty-based learning of acoustic models from noisy data

Abstract: Revised version including a bugfix in the computation of the Wiener uncertainty estimator and in the corresponding numerical results in Tables 1, 2, 3, E.6, E.7 and in Figure 6 compared to the original version published by Elsevier. We consider the problem of acoustic modeling of noisy speech data, where the uncertainty over the data is given by a Gaussian distribution. While this uncertainty has been exploited at the decoding stage via uncertainty decoding, its usage at the training stage…

Cited by 25 publications (35 citation statements)
References 27 publications
“…2) Vector Taylor Series: VTS consists of linearizing the MFCC transform by its first-order Taylor series expansion [43]. The mean of the features are computed as given in the MFCC computation.…”
Section: Moment Matching (mentioning)
confidence: 99%
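
The first-order linearization described in this statement can be sketched as follows. This is a minimal illustration, not the cited paper's implementation: it assumes the MFCC transform is reduced to a logarithm followed by a DCT applied to mel-band energies, and that the input uncertainty is a Gaussian with mean `mu` and covariance `sigma`. The function names `dct_matrix` and `vts_propagate` are hypothetical. The output mean is the transform applied to the input mean, and the output covariance is obtained from the Jacobian of the transform at that mean.

```python
import math


def dct_matrix(n_ceps, n_mel):
    """Unnormalized type-II DCT matrix of shape n_ceps x n_mel."""
    return [[math.cos(math.pi * k * (m + 0.5) / n_mel) for m in range(n_mel)]
            for k in range(n_ceps)]


def vts_propagate(mu, sigma, dct):
    """Propagate N(mu, sigma) over mel-band energies through
    f(x) = dct @ log(x) using a first-order Taylor expansion around mu.

    Assumption: every entry of mu is strictly positive.
    """
    n_mel = len(mu)
    n_ceps = len(dct)

    # Mean of the features: the transform evaluated at the input mean.
    log_mu = [math.log(m) for m in mu]
    mean_out = [sum(dct[k][m] * log_mu[m] for m in range(n_mel))
                for k in range(n_ceps)]

    # Jacobian of f at mu: J[k][m] = dct[k][m] / mu[m].
    jac = [[dct[k][m] / mu[m] for m in range(n_mel)] for k in range(n_ceps)]

    # Covariance of the features: J @ sigma @ J^T.
    j_sigma = [[sum(jac[k][i] * sigma[i][j] for i in range(n_mel))
                for j in range(n_mel)] for k in range(n_ceps)]
    cov_out = [[sum(j_sigma[k][j] * jac[l][j] for j in range(n_mel))
                for l in range(n_ceps)] for k in range(n_ceps)]
    return mean_out, cov_out


# Toy input: four mel bands with a diagonal input covariance.
mu = [1.0, 2.0, 4.0, 8.0]
sigma = [[0.01, 0.0, 0.0, 0.0],
         [0.0, 0.04, 0.0, 0.0],
         [0.0, 0.0, 0.16, 0.0],
         [0.0, 0.0, 0.0, 0.64]]
mean_out, cov_out = vts_propagate(mu, sigma, dct_matrix(2, 4))
```

The linearization is only accurate when the input variance is small relative to the mean, since the logarithm is strongly curved near zero.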
“…Classification 1) Data and Algorithmic Settings: As an example classification task, we consider the noise-robust speaker identification benchmark in [43]. This benchmark consists of noiseless reverberated utterances and real domestic noise backgrounds from the 2nd CHiME Speech Separation and Recognition Challenge [55] which are mixed together at six different signal-to-noise ratios (SNR) from -6 to 9 dB.…”
Section: Experimental Evaluation (mentioning)
confidence: 99%
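
As a generic illustration of what mixing speech and noise at a given SNR means, the sketch below scales a noise signal so that the speech-to-noise power ratio matches a target value in dB. It is not a description of how the benchmark itself was built. The helper names `power` and `mix_at_snr` are hypothetical, and the 3 dB spacing of the six SNR values is an assumption; the statement only gives the range from -6 to 9 dB.

```python
import math


def power(x):
    """Average power of a signal given as a list of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals `snr_db`,
    then add it to `speech`. Returns the mixture and the applied gain."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    mixture = [s + gain * n for s, n in zip(speech, noise)]
    return mixture, gain


# Assumed SNR grid: six evenly spaced values from -6 to 9 dB.
snr_grid = [-6, -3, 0, 3, 6, 9]

# Toy signals standing in for a speech utterance and a noise background.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]

achieved = []
for snr_db in snr_grid:
    _, gain = mix_at_snr(speech, noise, snr_db)
    scaled_noise = [gain * n for n in noise]
    achieved.append(10 * math.log10(power(speech) / power(scaled_noise)))
```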
“…The distribution of speech distortions is typically approximated as a Gaussian from which the uncertainty or variance of speech distortions is derived. The uncertainty can be computed directly in the ASR feature domain [2, 6–10] or propagated from the spectral domain to the feature domain [1, 11–15].…”
Section: Introduction (mentioning)
confidence: 99%