2012
DOI: 10.1587/transinf.e95.d.1229

Selective Gammatone Envelope Feature for Robust Sound Event Recognition

Abstract: Conventional features for Automatic Speech Recognition and Sound Event Recognition such as Mel-Frequency Cepstral Coefficients (MFCCs) have been shown to perform poorly in noisy conditions. We introduce an auditory feature based on the gammatone filterbank, the Selective Gammatone Envelope Feature (SGEF), for Robust Sound Event Recognition, where channel selection and the filterbank envelope are used to reduce the effect of noise for specific noise environments. In the experiments with Hidden Markov Model…
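For readers unfamiliar with the gammatone filterbank named in the abstract, the following is a minimal sketch of its standard textbook form, not the paper's implementation. It assumes the commonly used Glasberg and Moore ERB formulas, a 4th-order filter, and a bandwidth factor of 1.019; the function names, channel count, and frequency range are illustrative choices. The SGEF channel-selection criterion is not described in the truncated abstract and is not reproduced here.

```python
import math


def erb_bandwidth(fc_hz):
    """Equivalent rectangular bandwidth (Hz) at centre frequency fc_hz."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437


def centre_frequencies(f_low_hz, f_high_hz, n_channels):
    """Centre frequencies spaced uniformly on the ERB-rate scale."""
    lo = hz_to_erb_rate(f_low_hz)
    hi = hz_to_erb_rate(f_high_hz)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]


def gammatone_ir(fc_hz, fs_hz, duration_s=0.025, order=4, b=1.019):
    """Unnormalised gammatone impulse response sampled at fs_hz.

    g(t) = t^(order-1) * exp(-2*pi*b*ERB(fc)*t) * cos(2*pi*fc*t)
    """
    n_samples = int(duration_s * fs_hz)
    decay = 2.0 * math.pi * b * erb_bandwidth(fc_hz)
    response = []
    for k in range(n_samples):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-decay * t)
        response.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    return response


if __name__ == "__main__":
    # Hypothetical configuration: 8 channels between 100 Hz and 4 kHz.
    for fc in centre_frequencies(100.0, 4000.0, 8):
        ir = gammatone_ir(fc, fs_hz=16000)
        print(f"fc = {fc:7.1f} Hz, ERB = {erb_bandwidth(fc):6.1f} Hz, taps = {len(ir)}")
```

Filtering a signal with each channel's impulse response and taking the envelope of each output would give per-channel features; choosing which channels to keep for a given noise environment is the step the paper contributes.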

Cited by 3 publications (4 citation statements)
References 22 publications
“…Several feature representations for acoustic signals have been proposed over the years for capturing frequency contents and temporal structures of acoustic signals (Mitrović et al., 2010). The most frequently used features are the Mel-Frequency Cepstral Coefficients (MFCC) (Chu et al., 2009) and Gammatone Cepstral Coefficients (GTCC) (Leng et al., 2012). Both these features mimic the human auditory system, as they are more sensitive to changes in the low-frequency components.…”
Section: Introduction
confidence: 99%
“…Further to this, recent works have also demonstrated improved performance through pitch-adaptivity [135] or selection [136] of the gammatone filterbanks. For example, in [136], filterbank channel selection is performed to adapt an SER system to changing environmental conditions. This was shown to outperform both MFCC and selective Mel-filterbank features.…”
Section: Auditory Modelling
confidence: 85%
“…For non-speech audio classification, it has been shown that GTCCs can outperform both MFCCs and MPEG-7 using both kNN and SVM classifiers [86]. Further to this, recent works have also demonstrated improved performance through pitch-adaptivity [135] or selection [136] of the gammatone filterbanks. For example, in [136], filterbank channel selection is performed to adapt an SER system to changing environmental conditions.…”
Section: Auditory Modelling
confidence: 92%
“…By contrast, Gammatone filter impulse responses were obtained from measures on the basilar membrane of small mammals. Moreover, applying a Gammatone filterbank to the spectrogram has been shown to be more robust against ambient noise in acoustic event monitoring compared with Mel-scale filterbank representations [20], [21]. The Gammatone filterbank has also shown good performance in automatic audio captioning systems [22] and active noise control systems [23].…”
confidence: 99%