2006
DOI: 10.1109/tsa.2005.855834

Robust speech recognition in noisy environments based on subband spectral centroid histograms

Abstract: We investigate how dominant-frequency information can be used in speech feature extraction to increase the robustness of automatic speech recognition against additive background noise. First, we review several earlier proposed auditory-based feature extraction methods and argue that the use of dominant-frequency information might be one of the major reasons for their improved noise robustness. Furthermore, we propose a new feature extraction method, which combines subband power information with dominant subban…

Cited by 52 publications (45 citation statements)
References 19 publications
“…It can be seen clearly that as compared to others, speech signals have significantly more weight on low frequency spectrum from 300Hz to 600Hz. In order to accomplish speech detection in real time, we have implemented the SSCH (Subband Spectral Centroid Histogram) algorithm [29] on mobile devices. Specifically, SSCH passes the power spectrum of the recorded sound clip to a set of highly overlapping bandpass filters and then computes the spectral centroid on each subband and finally constructs a histogram of the subband spectral centroid values.…”
Section: Real-time Background Sound Recognition
Citation type: mentioning
confidence: 99%
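
The three steps named in the statement above (power spectrum, overlapping subbands, per-subband centroid, histogram) can be sketched roughly as follows. This is a minimal illustration, not the cited paper's or the citing authors' implementation: the function name `subband_centroid_histogram`, the rectangular band shape, the Hann window, and all parameter defaults (`n_subbands`, `bandwidth_hz`, `n_bins`) are assumptions made for the example.

```python
import numpy as np

def subband_centroid_histogram(frame, sample_rate, n_subbands=20,
                               bandwidth_hz=500.0, n_bins=16):
    """Histogram of subband spectral centroids for one frame (illustrative sketch)."""
    # Step 1: power spectrum of the windowed frame.
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Step 2: evenly spaced band centers. The bands are wider than the spacing
    # between centers, so neighboring bands overlap heavily.
    # Rectangular bands are an assumption; the source does not give filter shapes.
    nyquist = sample_rate / 2.0
    half_width = bandwidth_hz / 2.0
    centers = np.linspace(half_width, nyquist - half_width, n_subbands)

    # Step 3: spectral centroid (power-weighted mean frequency) of each subband.
    centroids = []
    for center in centers:
        mask = (freqs >= center - half_width) & (freqs <= center + half_width)
        band_power = power[mask]
        total = band_power.sum()
        if total > 0:  # skip bands with no energy to avoid dividing by zero
            centroids.append(np.sum(freqs[mask] * band_power) / total)

    # Step 4: histogram of the centroid values over the full frequency range.
    hist, edges = np.histogram(centroids, bins=n_bins, range=(0.0, nyquist))
    return hist, edges

# Usage: a pure tone pulls the centroids of nearby subbands toward its frequency,
# so the histogram bin containing the tone collects several counts.
sample_rate = 8000
t = np.arange(256) / sample_rate
tone = np.sin(2 * np.pi * 1100 * t)
hist, edges = subband_centroid_histogram(tone, sample_rate)
```

Because several overlapping subbands see the same spectral peak, their centroids cluster near that peak, which is the sense in which the histogram carries dominant-frequency information.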
“…Indeed, in this case, for any fixed n, the corresponding cross section of the spectrogram $|(W_g f)(n, x)|^2$ is but a trigonometric polynomial in terms of the frequency x. As such, the integrals in (5) can be decomposed into a large linear combination of the symbolically-evaluated integrals (10) and (11). Indeed, SCA is a contribution to the existing literature precisely because it delivers a highly accurate computation of (5) at a reasonable cost.…”
Section: The Spectral Centroid Algorithm
Citation type: mentioning
confidence: 99%
“…Experimentation indicates (3) is less sensitive to noise than (2), making it a popular tool in speech processing [5,11,17]. Moreover, while (2) depends on pitch alone, the spectral centroid (3) depends on both pitch and timbre, a useful property in music processing [16].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Ensemble interval histograms (EIH) are probably the most well-known auditory-based features [8]. In [9], a novel feature set called sub-band spectral centroid histograms (SSCH) integrates dominant-frequency information with sub-band power information. Another type of feature widely used in current ASR systems is perceptual linear prediction (PLP) [10].…”
Section: Introduction
Citation type: mentioning
confidence: 99%