2008
DOI: 10.1109/icassp.2008.4518714

Localized spectro-temporal cepstral analysis of speech

Abstract: Drawing on recent progress in auditory neuroscience, we present a novel speech feature analysis technique based on localized spectro-temporal cepstral analysis of speech. We proceed by extracting localized 2D patches from the spectrogram and projecting them onto a 2D discrete cosine transform (2D-DCT) basis. For each time frame, a speech feature vector is then formed by concatenating low-order 2D-DCT coefficients from the set of corresponding patches. We argue that our framework has significant advantages over standard onedimens…
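The pipeline the abstract describes (extract patches, apply a 2D-DCT, concatenate low-order coefficients per frame) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the patch size, the frequency stride, the triangular coefficient selection, and the choice to start each patch at frame `t` are all assumptions made for the example.

```python
import math

def dct2(patch):
    """Orthonormal 2D DCT-II of a rectangular patch given as a list of rows."""
    n_rows, n_cols = len(patch), len(patch[0])

    def scale(k, n):
        # Orthonormal scaling: the DC term uses sqrt(1/n), the rest sqrt(2/n).
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            total = 0.0
            for i in range(n_rows):
                for j in range(n_cols):
                    total += (patch[i][j]
                              * math.cos(math.pi * (2 * i + 1) * u / (2 * n_rows))
                              * math.cos(math.pi * (2 * j + 1) * v / (2 * n_cols)))
            out[u][v] = scale(u, n_rows) * scale(v, n_cols) * total
    return out

def low_order(coeffs, order):
    """Flatten the coefficients whose index sum u + v is below `order`."""
    return [coeffs[u][v]
            for u in range(min(order, len(coeffs)))
            for v in range(min(order, len(coeffs[0])))
            if u + v < order]

def frame_features(spec, t, patch_h=4, patch_w=4, freq_step=2, order=3):
    """Concatenate low-order 2D-DCT coefficients of the patches at frame t.

    spec is indexed as spec[frequency_bin][time_frame]. Patches span
    frames t .. t + patch_w - 1 (hypothetical layout), so the caller
    should keep t + patch_w within the number of frames.
    """
    n_bins = len(spec)
    features = []
    for f0 in range(0, n_bins - patch_h + 1, freq_step):
        patch = [row[t:t + patch_w] for row in spec[f0:f0 + patch_h]]
        features.extend(low_order(dct2(patch), order))
    return features

# Toy spectrogram: 8 frequency bins, 6 frames, constant energy.
spec = [[1.0] * 6 for _ in range(8)]
feats = frame_features(spec, t=0)
```

With these assumed defaults, three overlapping patches are taken along the frequency axis and each contributes six coefficients, so `feats` has 18 entries. The direct quadruple loop is only practical for small patches; a library DCT would be the natural replacement in real use.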

Cited by 43 publications (44 citation statements)
References: 10 publications
“…The class of spectro-temporal modulation (STM) features have been recently evaluated for speech applications including speech/non-speech discrimination [11], phonetic recognition [12], keyword spotting [13], and word recognition [14]. However, in each case a frame-based strategy was employed, which ultimately relied on dimensionality reduction to improve computational tractability.…”
Section: Introduction (mentioning)
confidence: 99%
“…With this step we smooth out the unnecessary fine details from the spectrum and reduce feature dimensionality at the same time. Bouvrie et al proposed keeping only the 6 lowest-order 2D-DCT coefficients corresponding to the lower left 3x3 triangle of the coefficient matrix [9], while here we are going to keep 9 coefficients. The right hand side of Fig.…”
Section: Localized Spectro-temporal Features (mentioning)
confidence: 99%
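The two coefficient selections contrasted in this statement can be illustrated with a short sketch. It assumes that the "3x3 triangle" means the index pairs with u + v < 3 and that the 9-coefficient variant is the full 3x3 low-order block; neither detail is spelled out in the quoted text, and the function names are hypothetical.

```python
def triangle_indices(order):
    """Index pairs (u, v) with u + v < order: 6 pairs when order is 3."""
    return [(u, v) for u in range(order) for v in range(order) if u + v < order]

def square_indices(order):
    """All index pairs in the order x order low-frequency block: 9 when order is 3."""
    return [(u, v) for u in range(order) for v in range(order)]

def select(coeffs, indices):
    """Pick the listed coefficients out of a 2D-DCT coefficient matrix."""
    return [coeffs[u][v] for u, v in indices]

# Stand-in coefficient matrix; real values would come from a 2D-DCT of a patch.
coeffs = [[10 * u + v for v in range(4)] for u in range(4)]
six = select(coeffs, triangle_indices(3))
nine = select(coeffs, square_indices(3))
```

Under these assumptions the triangle drops the three highest-order terms of the block, (1, 2), (2, 1), and (2, 2), which is the only difference between the two feature sets.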
“…These findings motivate those feature extraction methods that process the spectro-temporal representation in patches which are localized in both time and frequency. The most popular among these methods is to analyze these localized time-frequency patches by means of Gabor filters [8], but here we are going to follow the study of Bouvrie et al [9], and apply two-dimensional DCT to process them. Fig.…”
Section: Localized Spectro-temporal Features (mentioning)
confidence: 99%
“…Dynamic information is subsequently added by appending approximate temporal derivatives of the cepstral features. Other features such as TRAPS/HATS [1], frequency domain linear prediction features [2], multiresolution RASTA features [3], 2D-DCT localized features [4] extract information directly from the spectro-temporal plane.…”
Section: Introduction (mentioning)
confidence: 99%
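Appending approximate temporal derivatives to per-frame cepstral features, as this statement mentions, is often done with a regression-style delta. The sketch below shows one common formulation; the window width, the edge handling (replicating the first and last frames), and the function names are assumptions, not details taken from the citing paper.

```python
def delta(frames, width=2):
    """Regression-based delta of per-frame feature vectors.

    frames is a list of equal-length lists, one per time frame.
    Frames beyond either end are replaced by the nearest edge frame.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        vec = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, width + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                num += k * (ahead - behind)
            vec.append(num / denom)
        out.append(vec)
    return out

def append_delta(frames, width=2):
    """Concatenate each frame's static features with its delta features."""
    return [f + d for f, d in zip(frames, delta(frames, width))]

# A feature that rises by one per frame has an interior slope of one.
ramp = [[float(t)] for t in range(6)]
with_delta = append_delta(ramp)
```

Away from the edges the delta of a linear ramp equals its slope, which is a quick sanity check on the weighting; near the edges the replicated frames flatten the estimate.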