2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 2007
DOI: 10.1109/icassp.2007.367246

Normalizing the Speech Modulation Spectrum for Robust Speech Recognition

Cited by 14 publications (8 citation statements)
References 43 publications
“…In this case, there are only K free parameters. It is reported in [61] that the ensemble modeling approach outperforms MLLR significantly on Aurora-2 task. Two extensions of ensemble modeling are reported in [79].…”
Section: Ensemble Modeling
mentioning
confidence: 99%
“…Hence, it is coarse to normalize the histogram of just one utterance to the global histogram of the entire training set which usually consists of thousands of utterances. There is another simple example to demonstrate the drawbacks of CMN, CVN, and HEQ [61]. Suppose there is an utterance with several words.…”
Section: Histogram Normalization
mentioning
confidence: 99%
“…The temporal filters are another group of techniques to reduce the mismatch by filtering the feature trajectories [4][5][6][7][8][9][10][11][12]. These temporal filters are designed from different perspectives, e.g., the RelAtive SpecTrA (RASTA) filter [4] and the MVA filter [5] are empirically designed to remove the very low and/or high modulation frequencies that are believed to be less relevant to speech intelligibility but prone to environmental distortions.…”
Section: Introduction
mentioning
confidence: 99%
“…The data-driven filters [6][7][8] are designed from the data to better represent the features or improve the features' discriminative ability. The temporal structure normalization (TSN) filter [9][10][11] and the temporal smoothing (TES) filter [12] are designed to normalize the inter-frame correlation of the feature. Although designed from very different criteria, the resulting temporal filters are usually band-pass or low-pass.…”
Section: Introduction
mentioning
confidence: 99%
“…Compared to other temporal filters [27][28][29][30][31][32][33][34], an advantage of the TSN filter is that it is able to automatically adapt to environment distortions in the speech signal and provides customized filtering. The work to be presented in this chapter has been published in [128][129][130][131].…”
mentioning
confidence: 99%