2020
DOI: 10.3390/app10062020

A Review of Deep Learning Based Methods for Acoustic Scene Classification

Abstract: The number of publications on acoustic scene classification (ASC) in environmental audio recordings has constantly increased over the last few years. This was mainly stimulated by the annual Detection and Classification of Acoustic Scenes and Events (DCASE) competition with its first edition in 2013. All competitions so far involved one or multiple ASC tasks. With a focus on deep learning based ASC algorithms, this article summarizes and groups existing approaches for data preparation, i.e., feature representa…

Cited by 129 publications (98 citation statements)
References 61 publications
“…Audio-based risky behavior detection is based on complex features and distinguishable behaviors (e.g., coughing, sneezing, background noise), which requires a deeper CNN model than shallow architecture (i.e., two or three-layer architecture) offers [ 75 ]. VGG16 has been adopted for audio event detection and demonstrated significant literature results [ 71 ]. The feature maps were flattened to obtain the fully connected layer after the last convolutional layer.…”
Section: Results (mentioning)
confidence: 99%
“…A recent trend in the field of ASC is to adopt data driven methods, wherein acoustic scene features are learned from data [ 34 ]. Among the convolutional neural network (CNN) models, the ResNet model [ 35 ] exhibits high accuracy and thus is typically used as a backbone neural network model [ 36 , 37 , 38 , 39 ].…”
Section: Related Work (mentioning)
confidence: 99%
“…The task of determining the source of a sound is known as Sound Event Detection (SED). Although conventional machine learning algorithms have been used for SED in the past, current state-of-the-art approaches are based on deep learning models [7,8]. Over the last years, deep learning algorithms have consistently outperformed conventional machine learning ones in the annual Detection and Classification of Acoustic Scenes and Events (DCASE) challenge [7].…”
Section: Related Work (mentioning)
confidence: 99%