2017 25th European Signal Processing Conference (EUSIPCO) 2017
DOI: 10.23919/eusipco.2017.8081711

A hybrid approach with multi-channel i-vectors and convolutional neural networks for acoustic scene classification

Abstract: In Acoustic Scene Classification (ASC), two major approaches have been followed. While one utilizes engineered features such as mel-frequency cepstral coefficients (MFCCs), the other uses learned features that are the outcome of an optimization algorithm. I-vectors are the result of a modeling technique that usually takes engineered features as input. It has been shown that standard MFCCs extracted from monaural audio signals lead to i-vectors that exhibit poor performance, especially on indoor acoust…

Cited by 19 publications (13 citation statements)
References 12 publications
“…Similar investigations have also been reported in previous literature. [47][48][49][50][51][52] Even though these studies [53][54][55][56][57][58][59][60] demonstrate that CNNs have significant capacity to process the MFCC and capture features for classification, some demerits limit the performance. Notably, current CNN-based methods require large amounts of data to train the models extensively, and the models need to relearn their inherent parameters to incorporate classifications when encountering new input.…”
Section: MFCC Feature Classification Using CNN-based Methods (mentioning)
confidence: 99%
“…Concerning the experimented methodologies, the deep learning methods are predominant, with Feed-Forward Neural Networks and Convolutional Neural Networks (CNN) in the leading position. Many strategies are currently operating CNNs in conjunction with a variety of features, among which log-scaled mel-spectrograms [26]-[28], CNN-LTE [25] or in hybrid approaches [29]. These implementations outperformed the other attempts to approach EASR and SER tasks.…”
Section: State-of-the-art (mentioning)
confidence: 99%
“…A small number of systems have used spatial features extracted from binaural recordings. In order to obtain the advantages from feature engineering approaches (i-vector) and feature learning methods (CNN), the authors in [13] proposed a multichannel i-vector by computing MFCC for both channels in the audio sample. In addition, they built a CNN model similar to VGG-net (invented by the Visual Geometry Group) architecture that takes spectrogram features as the input.…”
Section: Sound Recognition (mentioning)
confidence: 99%
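
The multi-channel front end described in the last statement above (MFCCs computed separately for each channel of a binaural recording) can be sketched roughly as follows. This is a minimal NumPy illustration, not the cited authors' implementation: the function names, frame length, hop size, number of mel bands, and number of coefficients are all assumptions, and the i-vector extraction stage that would consume these features is not shown.

```python
import numpy as np


def hz_to_mel(f):
    # Common HTK-style mel mapping (one of several conventions in use).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def mfcc(signal, sr, n_fft=1024, hop=512, n_mels=40, n_mfcc=20):
    """MFCC matrix of shape (n_frames, n_mfcc) for a mono signal.

    All default parameter values are illustrative assumptions.
    """
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)
    # Unnormalized DCT-II basis applied along the mel axis.
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)
    basis = np.cos(np.pi * (n[None, :] + 0.5) * k[:, None] / n_mels)
    return log_mel @ basis.T


def per_channel_mfcc(stereo, sr, **kwargs):
    """stereo: array of shape (n_samples, 2). Returns one MFCC matrix per channel."""
    return {
        "left": mfcc(stereo[:, 0], sr, **kwargs),
        "right": mfcc(stereo[:, 1], sr, **kwargs),
    }


if __name__ == "__main__":
    # Synthetic stand-in for a one-second binaural clip at 16 kHz.
    rng = np.random.default_rng(0)
    clip = rng.standard_normal((16000, 2))
    feats = per_channel_mfcc(clip, 16000)
    print({name: m.shape for name, m in feats.items()})
```

Each channel yields its own feature matrix, so a downstream model can be trained per channel or on the combined set; which combination works best is an empirical question that this sketch does not address.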