2017
DOI: 10.1016/j.patrec.2017.01.013

Combining visual and acoustic features for audio classification tasks

Cited by 74 publications (63 citation statements)
References 14 publications
“…The spectrogram has rich sound features [12][13][14]. To enhance the system capabilities, we considered introducing CNN [15], which deals with images.…”
Section: CNN (mentioning)
confidence: 99%
“…As this study focuses on sound analysis, we investigated recent studies on sound signal processing using AI techniques, and then discovered several studies on audio feature analysis using a spectrogram. A spectrogram image obtained by short-time Fourier transform contains rich information regarding sound characteristics [12][13][14]; for this, a convolution neural network (CNN) [15] is employed to classify the sound from the inputted spectrogram. Justin and Juan [16] addressed the classification of an environmental sound using CNN, into which a spectrogram-like image (mel-spectrogram) is inputted.…”
Section: Introduction (mentioning)
confidence: 99%
“…Most ML approaches in animal call classification take their lead from automated speech recognition by virtue of the commonalities between human speech and birdcalls. These ML approaches include supervised neural networks (including deep learning neural networks) [17]-[21], unsupervised neural networks [22], support vector machines [23]-[25], decision trees [26], [27], random forests [28], [29], and hidden markov model [30]-[34]. Despite the significant amount of research into the automated classification of birdcalls, there is not yet an adequate method for field recordings due to the challenges associated with birdcall classification, such as the high variability in calls.…”
Section: A Birdcalls In Acoustic Recording (mentioning)
confidence: 99%
“…Sound classification and recognition has been included among the pattern recognition tasks for different application domains, e.g. speech recognition [1], music classification [2], environmental sound recognition or biometric identification [3]. In the traditional pattern recognition framework (preprocessing, feature extraction and classification) features have generally been extracted from the actual audio traces (e.g.…”
Section: Introduction (mentioning)
confidence: 99%