2016
DOI: 10.48550/arxiv.1607.02383
Preprint

Acoustic scene classification using convolutional neural network and multiple-width frequency-delta data augmentation

Yoonchang Han,
Kyogu Lee

Abstract: In recent years, neural network approaches have shown superior performance to conventional hand-made features in numerous application areas. In particular, convolutional neural networks (ConvNets) exploit spatially local correlations across input data to improve the performance of audio processing tasks, such as speech recognition, musical chord recognition, and onset detection. Here we apply ConvNet to acoustic scene classification, and show that the error rate can be further decreased by using delta features…

Cited by 11 publications (11 citation statements)
References 25 publications
“…In recent years, the organizers of the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge [1,2,3] provided both the benchmark data and a competitive platform to promote acoustic scene research and analysis. If we are to analyze top ASC systems reported in the challenge, we will find that most of them are built on the deep neural networks (DNNs) framework, and the key ingredient of their success is the use of convolutional layers [4,5,6,7,8]. Advanced deep learning techniques, such as attention mechanism [9,10], mix-up [11,12], generative adversarial network (GAN) and variational auto encoder (VAE) based data augmentation [13,14], and deep feature learning [15,16,17] can further enhance ASC results.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In recent years, we have witnessed great progress in the acoustic scene classification (ASC) task, as demonstrated by the high participation in the IEEE Detection and Classification of Acoustic Scenes and Events (DCASE) challenges [1,2,3]. Top ASC systems use deep neural networks (DNNs), and the main ingredient of their success is the application of deep convolutional neural networks (CNNs) [4,5,6,7,8,9]. A further boost in ASC performance is obtained with the introduction of advanced deep learning techniques, such as attention mechanism [10,11,12], mix-up [13,14], Generative Adversarial Network (GAN) and Variational Auto Encoder (VAE) based data augmentation [15,16], and deep feature learning [17,18,19,20].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…ASC has been an attractive research field for decades, and the IEEE Detection and Classification of Acoustic Scenes and Events (DCASE) challenge [1,2,3] provides the benchmark data and a competitive platform to promote sound scene research and analyses. In recent years, we have witnessed that deep neural networks (DNNs) have gradually dominated the design of top ASC systems, and the main ingredient of their success is the application of deep convolutional neural networks (CNNs) [4,5,6,7]. Furthermore, with the use of advanced deep learning techniques, such as attention mechanism [8,9,10] and deep network based data augmentation [11,12,13], a further boost in ASC system performance can be obtained.…”
Section: Introduction (citation type: mentioning)
confidence: 99%