Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Lang 2003
DOI: 10.3115/1073483.1073514

Auditory-based acoustic distinctive features and spectral cues for robust automatic speech recognition in Low-SNR car environments

Abstract: In this paper, a multi-stream paradigm is proposed to improve the performance of automatic speech recognition (ASR) systems in the presence of highly interfering car noise. It was found that combining the classical MFCCs with some auditory-based acoustic distinctive cues and the main formant frequencies of a speech signal using a multi-stream paradigm leads to an improvement in the recognition performance in noisy car environments.
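The abstract does not say how the streams are combined. One common way to realize a multi-stream paradigm in HMM-based ASR is to score each feature stream separately and sum the per-stream log-likelihoods with stream weights. The sketch below illustrates only that generic idea and is not the authors' implementation: the function names, the diagonal-Gaussian state model, the stream dimensions, and the weights are all assumptions made for illustration.

```python
import numpy as np


def diag_gauss_loglik(x, mean, var):
    """Log-density of vector x under a diagonal-covariance Gaussian."""
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var))


def multi_stream_loglik(streams, models, weights):
    """Weighted sum of per-stream log-likelihoods for a single frame.

    streams: one feature vector per stream (e.g. MFCCs, auditory cues, formants)
    models:  one dict per stream with "mean" and "var" arrays (hypothetical state model)
    weights: one non-negative stream weight per stream (hypothetical values)
    """
    if not (len(streams) == len(models) == len(weights)):
        raise ValueError("streams, models and weights must have the same length")
    return sum(
        w * diag_gauss_loglik(x, m["mean"], m["var"])
        for x, m, w in zip(streams, models, weights)
    )


# Illustrative frame: a 12-dim MFCC stream, a 7-dim auditory-cue stream,
# and a 3-dim formant stream. All dimensions and values are made up.
rng = np.random.default_rng(0)
dims = (12, 7, 3)
frame = [rng.normal(size=d) for d in dims]
state = [{"mean": np.zeros(d), "var": np.ones(d)} for d in dims]
score = multi_stream_loglik(frame, state, weights=[1.0, 0.5, 0.5])
```

Setting a stream's weight to zero removes its contribution entirely, which gives a simple way to compare an MFCC-only baseline against the combined system under the same model.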

Cited by 11 publications (7 citation statements)
References 4 publications
“…The obtained results showed that the use of the formant frequencies for ASR in a multi-stream paradigm improves the ASR performance. Then in Selouani et al (2003), we extended our work to evaluate the robustness of the above mentioned proposed features using a multi-stream paradigm for ASR in noisy car environments. The obtained results showed that the use of such features renders the recognition process more robust in noisy car environments.…”
Section: Robust Front-end Processing
Citation type: mentioning
confidence: 99%
“…Another important transformation of the predictor coefficients is the set of partial correlation coefficients or reflection coefficients. In previous papers (Tolba et al 2002; Selouani et al 2003), we introduced a multi-stream paradigm for ASR in which we merge different sources of information about the speech signal that could be lost when using only the MFCCs to recognize uttered speech. Our experiments in Tolba et al (2002) showed that the use of some auditory-based features and formant cues via a multi-stream paradigm approach leads to an improvement of the recognition performance.…”
Section: Robust Front-end Processing
Citation type: mentioning
confidence: 99%
“…Another important transformation of the predictor coefficients is the set of partial correlation coefficients or reflection coefficients. In previous papers [2][3][4], we introduced a multi-stream paradigm for ASR in which, we merge different sources of information about the speech signal that could be lost when using only the MFCCs to recognize uttered speech. Our experiments in [2] showed that the use of some auditory-based features and formant cues via a multi-stream paradigm approach leads to an improvement of the recognition performance.…”
Section: Robust Front-end Processing
Citation type: mentioning
confidence: 99%
“…However, they note that, for most languages, it is not necessary to use all twelve cues. Caelen's computational auditory model made it possible to quantify these acoustic cues, which paved the way for their incorporation into many practical configurations (Caelen, 1985; Selouani, Tolba and O'Shaughnessy, 2003). • Acute/Grave cue (AG): from an articulatory standpoint, the gravity of a phoneme is produced by a larger volume of the oral cavity.…”
Section: Static Auditory Acoustic Cues (original: Les Indices Acoustiques Auditifs Statiques)
Citation type: unclassified