2013
DOI: 10.1121/1.4807807

Narrow-band autocorrelation function features for the automatic recognition of acoustic environments

Abstract: Acoustic environments are typically composed of multiple sound sources of different typologies, making them especially complex to model and parameterize. To develop an automatic acoustic environment recognition system, this work proposes a spectro-temporal signal parameterization technique inspired by human perception. The proposed parameters are derived from the analysis of the autocorrelation function of narrow-band signals (NB-ACF) obtained from an auditory gammatone filter bank. Five features related to ac…

Cited by 9 publications (5 citation statements)
References 24 publications
“…Class-dependent temporal-spectral structures and long-term descriptive statistics features were extracted for sound events. Other authors applied the Discrete Gabor Transform (DGT) audio image representation [119], multiresolution feature [53], hybrid method based on mel frequency cepstral coefficient and the gammatone frequency cepstral coefficient [62], inverted MFCC and extended MFCC [66], bag of audio words (BoAW) [120], narrow band auto-correlation features (NB-ACF) [121].…”
Section: Feature Extraction Methods In Sound Classification (mentioning)
confidence: 99%
“…These features have been shown to provide good performance for indoor and outdoor environmental sound classification. In [161], the same authors improved this technique by replacing the Mel filter bank employed to obtain the narrow-band signals with a Gammatone filter bank with Equivalent Rectangular Bandwidth bands. In addition, the Autocorrelation Zero Crossing Rate (AZCR) was added, following previous works like the one by Ghaemmaghami et al. [43].…”
Section: Perceptual Autocorrelation-based Features (mentioning)
confidence: 99%
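
The two ingredients named in this statement can be illustrated with a minimal sketch. It assumes the widely used Glasberg-Moore approximation for the Equivalent Rectangular Bandwidth and a simple sign-change count for the AZCR; the cited paper may define both differently, and the function names `erb_bandwidth_hz`, `autocorrelation` and `azcr` are hypothetical.

```python
def erb_bandwidth_hz(center_hz):
    """ERB in Hz at a centre frequency, using the Glasberg-Moore
    approximation (an assumption; the cited work may use other constants)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def autocorrelation(x, max_lag):
    """Biased, unnormalised autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def azcr(x, max_lag):
    """Fraction of adjacent lag pairs at which the autocorrelation
    changes sign (an illustrative definition of the AZCR)."""
    if not 1 <= max_lag < len(x):
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(x)")
    r = autocorrelation(x, max_lag)
    crossings = sum(1 for a, b in zip(r, r[1:]) if a * b < 0)
    return crossings / max_lag


# A rapidly alternating signal flips the sign of its autocorrelation
# at every lag, while a constant signal never does.
fast = azcr([1.0, -1.0] * 4, max_lag=7)
flat = azcr([1.0] * 8, max_lag=7)
```

In a full front end, `azcr` would be applied to each narrow-band output of the filter bank rather than to the raw signal; the filter bank itself is left out of this sketch.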
“…• Kernel Power Flow Orientation Coefficients (KPFOC): in the works by Gerazov and Ivanovski [184,185], a bank of 2D kernels is used to estimate the orientation of the power flow at every point in the auditory spectrogram calculated using a Gammatone filter bank (Valero and Alías [161]), obtaining an ASR front-end with increased robustness to both noise and room reverberation with respect to previous approaches, especially for small vocabulary tasks.…”
Section: Wavelet-based Perceptual Features (mentioning)
confidence: 99%
“…On the other hand, the auto-correlation function (ACF) represents the time-evolution and has an intimate relationship with the power spectral density (PSD) of the underlying signal. Valero and Alias [26] proposed a new set of features called the Narrow-Band Auto Correlation Function features (NB-ACF). The extraction of NB-ACF features can be explained using Fig.…”
Section: Stationary ESR Techniques (mentioning)
confidence: 99%
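
The relationship between the ACF and the PSD mentioned in this statement is the Wiener-Khinchin relation. The sketch below checks its discrete, circular form numerically: the DFT of the circular autocorrelation of a real sequence equals the squared magnitude of the sequence's DFT. The helper names `dft` and `circular_acf` and the sample values are illustrative assumptions, not taken from the cited papers.

```python
import cmath


def dft(x):
    """Plain O(N^2) discrete Fourier transform, written out for clarity."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * m * t / n) for t in range(n))
            for m in range(n)]


def circular_acf(x):
    """Circular (periodic) autocorrelation of a real sequence."""
    n = len(x)
    return [sum(x[t] * x[(t + k) % n] for t in range(n)) for k in range(n)]


x = [1.0, 2.0, 3.0, 4.0]

# Unnormalised power spectrum computed directly from the signal.
power_spectrum = [abs(c) ** 2 for c in dft(x)]

# The same spectrum obtained by transforming the autocorrelation.
acf_spectrum = [c.real for c in dft(circular_acf(x))]

max_difference = max(abs(a - b) for a, b in zip(power_spectrum, acf_spectrum))
```

`max_difference` stays at the level of floating-point rounding, which is why ACF-based features carry spectral information while being computed in the lag domain.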
“…This demands a sub-frame length to be much larger than that used in sub-framing processing. It is recommended in [26] that a sub-frame be of size 500 ms with an overlap of 400 ms in a frame of 4 s. Finally, KNN and SVM classifiers are used for decision making in each sub-frame. The performance of NB-ACF features was compared with MFCC and discrete wavelet transform (DWT) coefficients with a data-set consisting of 15 environmental scenes.…”
Section: Stationary ESR Techniques (mentioning)
confidence: 99%
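
The sub-framing arithmetic quoted above can be worked through with a short sketch. A 500 ms sub-frame with 400 ms overlap advances by 100 ms per step, so a 4 s frame holds (4000 - 500) / 100 + 1 = 36 complete sub-frames. The function name `subframe_bounds_ms` is hypothetical, and the sketch assumes only complete sub-frames are kept; whether the original work pads or drops a trailing partial sub-frame is not stated in the quoted text.

```python
def subframe_bounds_ms(frame_ms=4000, sub_ms=500, overlap_ms=400):
    """Return (start, end) times in ms of every complete sub-frame."""
    hop_ms = sub_ms - overlap_ms  # 100 ms step between sub-frame starts
    if hop_ms <= 0:
        raise ValueError("overlap must be smaller than the sub-frame length")
    last_start = frame_ms - sub_ms
    return [(start, start + sub_ms) for start in range(0, last_start + 1, hop_ms)]


bounds = subframe_bounds_ms()
subframe_count = len(bounds)  # 36 for the quoted settings
```

Each of these sub-frames would then be passed to the feature extractor and classified individually, as the statement describes.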