1996
DOI: 10.1109/97.489062

Signal conditioning techniques for robust speech recognition

Cited by 49 publications (29 citation statements)
References 4 publications
“…Given that each x_i can be understood as a varying excitation convolved by a channel and the cepstrum transforms a convolutional mix into an addition, the mean value will be the unvarying part, which corresponds to the channel. This idea has been used in speech processing to determine the spectral characteristics of channels [28].…”
Section: Cepstral Analysis
mentioning
confidence: 99%
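
The statement above can be checked numerically. The following is a minimal sketch, not taken from the cited papers: it assumes circular convolution on short frames, a hypothetical three-tap channel, and white Gaussian excitation, and all function and variable names are illustrative. It shows that the real cepstrum of each observed frame is the sum of the excitation cepstrum and the channel cepstrum, so the mean over frames contains the channel as its fixed part, and subtracting that mean removes the channel from every frame. The mean approximates the channel only to the extent that the mean excitation cepstrum is small.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def real_cepstrum(x):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(x)
    log_mag = [math.log(abs(v)) for v in dft(x)]
    # The log magnitude spectrum of a real frame is real and symmetric,
    # so the inverse DFT reduces to a cosine sum.
    return [sum(log_mag[k] * math.cos(2 * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def circular_convolve(x, h):
    """Circular convolution, so the DFT of the result is an exact product."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(v[i] for v in vectors) / len(vectors)
            for i in range(len(vectors[0]))]


random.seed(0)
N = 16                                        # frame length (assumption)
channel = [1.0, 0.5, 0.25] + [0.0] * (N - 3)  # hypothetical fixed channel
excitations = [[random.gauss(0.0, 1.0) for _ in range(N)] for _ in range(200)]
observed = [circular_convolve(e, channel) for e in excitations]

cep_channel = real_cepstrum(channel)
cep_exc = [real_cepstrum(e) for e in excitations]
cep_obs = [real_cepstrum(x) for x in observed]

# 1) Convolution in time becomes addition in the cepstral domain.
additive_error = max(abs(cep_obs[0][t] - (cep_exc[0][t] + cep_channel[t]))
                     for t in range(N))

# 2) Subtracting the mean cepstrum removes the fixed channel term:
#    mean-normalized observed frames equal mean-normalized excitation frames.
mean_obs = mean_vector(cep_obs)
mean_exc = mean_vector(cep_exc)
cms_error = max(abs((cep_obs[i][t] - mean_obs[t]) - (cep_exc[i][t] - mean_exc[t]))
                for i in range(len(cep_obs)) for t in range(N))

# 3) How closely the mean alone matches the channel (quefrency 0 excluded,
#    since it also carries the average excitation level).
residual = max(abs(mean_obs[t] - cep_channel[t]) for t in range(1, N))

print(additive_error, cms_error, residual)
```

The first two quantities are identities and should vanish up to rounding. The third is only an approximation: it is exactly the mean excitation cepstrum, which shrinks as more frames are averaged but need not reach zero.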
“…By repeating the above EM iteration, we can get a series of approximate pdf whose mode is approaching to the mode 1 of the true posterior pdf (31). Thus, the hyperparameters are obtained at the last (actually th) EM iteration by using (13)-(19) to satisfy (32) and the CDHMM parameters are updated accordingly.…”
Section: M-step
mentioning
confidence: 99%
“…1, many existing techniques can be applied. They include, for example, the popular cepstral mean subtraction algorithm [2], different cepstral normalization methods (e.g., CDCN) discussed in [1], ML-based feature space stochastic matching methods [7], [41], [33], signal conditioning techniques [30], [31], etc. Acoustic normalization could even be integrated into the feature extraction stage, e.g., speaker normalization via vocal tract length normalization using frequency warping [39], [11], [22].…”
mentioning
confidence: 99%
“…In Appendix C, we show that the gradients of are given by (27) (28) (29) 1 If the number of classes, Y, is very large, one can restrict the sum over y in (22) to the N-best alternatives, where N Y.…”
Section: B Multivariate Gaussian Pdf's
mentioning
confidence: 99%
“…In particular, for the th data point, let (22) denote the log-likelihood-ratio between the softmax-smoothed competing hypothesis 1 and the correct one. The function provides a smooth measure of the misclassification of .…”
Section: A Discriminative Training
mentioning
confidence: 99%
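
The two statements above describe a smooth misclassification measure and an N-best restriction of its sum. Equation (22) of the citing paper is not reproduced on this page, so the sketch below is an assumption: it uses a form common in minimum classification error training, where the competing scores are combined with a scaled log-sum-exp (a soft maximum) and the correct class score is subtracted. The function names, the `eta` smoothing parameter, the averaging over competitors, and the `n_best` argument are all illustrative.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def misclassification_measure(scores, correct, eta=1.0, n_best=None):
    """Soft-max of competing log-likelihoods minus the correct one.

    scores  : dict mapping class label -> log-likelihood (hypothetical input)
    correct : label of the true class
    eta     : smoothing; larger values approach the single best competitor
    n_best  : if set, keep only the n_best highest-scoring competitors
    A positive result indicates a (smoothed) misclassification.
    """
    competitors = sorted((s for label, s in scores.items() if label != correct),
                         reverse=True)
    if not competitors:
        raise ValueError("need at least one competing class")
    if n_best is not None:
        competitors = competitors[:n_best]
    # Averaged soft maximum over the retained competitors.
    smoothed = (log_sum_exp([eta * s for s in competitors])
                - math.log(len(competitors))) / eta
    return smoothed - scores[correct]


example = {"a": -1.0, "b": -3.0, "c": -10.0}
d_correct_is_best = misclassification_measure(example, "a")
d_correct_is_worst = misclassification_measure(example, "c")
d_one_best = misclassification_measure(example, "a", n_best=1)
print(d_correct_is_best, d_correct_is_worst, d_one_best)
```

Restricting to the top competitors changes little when the dropped scores are far below the best one, because their terms in the log-sum-exp are exponentially small. That is the motivation for the N-best shortcut when the number of classes is very large.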