2007
DOI: 10.1007/s10772-009-9025-9

A noise-robust front-end for distributed speech recognition in mobile communications

Abstract: This paper investigates a new front-end processing that aims at improving the performance of speech recognition in noisy mobile environments. This approach combines features based on conventional Mel-cepstral Coefficients (MFCCs), Line Spectral Frequencies (LSFs) and formant-like (FL) features to constitute robust multivariate feature vectors. The resulting front-end constitutes an alternative to the DSR-XAFE (XAFE: eXtended Audio FrontEnd) available in GSM mobile communications. Our results showed that for hi…

Cited by 9 publications (7 citation statements)
References 8 publications
“…In order to improve the noise robustness of the DSR front-end; one must combine the MFCCs with features that are robust against noise. PLP features [4], RASTA features [5], and spectral peaks, also known as formant-like features [1], are some of the features that are known to be robust against additive noise. Choosing the best feature set highly depends on the application and constraints.…”
Section: Overview Of Distributed Speech Recognition (mentioning)
confidence: 99%
“…In previous work [1,13], we introduced a multi-stream paradigm for DSR in which, we merge different sources of information about the speech signal that could be lost when using only the MFCCs to recognize uttered speech. Our experiments showed that such multi-variable, integrating some parameters based on a model simulating the cochlea and the acoustic cues reflecting the spectral resonances (formants), leads to an improved recognition rate.…”
Section: Introduction (mentioning)
confidence: 99%
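
The multi-stream idea quoted above, merging several feature streams for each speech frame, can be illustrated with a minimal sketch. It assumes the simplest possible fusion, frame-wise concatenation, which may differ from what the cited papers do; the function name `fuse_streams` and the stream dimensions (13, 10 and 4) are hypothetical and chosen only for illustration.

```python
import numpy as np


def fuse_streams(mfcc, lsf, formant_like):
    """Concatenate per-frame feature streams into one multivariate vector.

    Each argument is a 2-D array of shape (n_frames, n_features_of_stream).
    Assumption: plain concatenation; a multi-stream HMM could instead keep
    the streams separate and weight their likelihoods.
    """
    streams = [np.asarray(s, dtype=float) for s in (mfcc, lsf, formant_like)]
    for s in streams:
        if s.ndim != 2:
            raise ValueError("each stream must be 2-D: (n_frames, n_features)")
    if len({s.shape[0] for s in streams}) != 1:
        raise ValueError("all streams must cover the same number of frames")
    # Stack the streams side by side: one row per frame.
    return np.hstack(streams)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical dimensions: 13 MFCCs, 10 LSFs, 4 formant-like features.
    fused = fuse_streams(
        rng.normal(size=(5, 13)),
        rng.normal(size=(5, 10)),
        rng.normal(size=(5, 4)),
    )
    print(fused.shape)
```

Keeping the streams aligned frame by frame is the only requirement of this sketch; how the streams are weighted during recognition is left open here.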
“…In our previous work [14, 15], we investigated the potential of a multi‐stream FE using the formant frequencies and line spectral frequencies (LSFs) features combined with conventional MFCCs to improve the performance of a DSR basic (FE) system in severely degraded environments. Formant and LSF features are more suitable for our application because extracting them can be done as part of the mel‐frequency cepstral coefficients (MFCC) extraction process, saving a lot of computation.…”
Section: Introduction (mentioning)
confidence: 99%
“…The latter is then used to train multi-stream hidden Markov models (HMMs). The reason that motivated us to consider LSFs is related to the fact that LSF regions of the spectrum may stay above the noise level even in very low SNR ratio, while the lower energy regions will tend to be masked by the noise energy [8]. Moreover, the extraction LSF parameters can be done as part of the process of MFCCs calculation which allows reducing significantly the added computation load.…”
Section: Introductionmentioning
confidence: 99%