2018
DOI: 10.1109/jas.2018.7511066

Deep Scalogram Representations for Acoustic Scene Classification

Abstract: Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach presented firstly transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; secondly, the sp…
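
To make the spectrogram/scalogram distinction in the abstract concrete, here is a minimal NumPy sketch that computes both representations for one audio segment. It is an illustration under stated assumptions, not the authors' pipeline: a Morlet wavelet is swapped in for the bump and Morse wavelets named in the abstract, and the function names, window size, hop size, frequency grid, and test tone are all hypothetical choices.

```python
import numpy as np

def spectrogram(x, n_fft=256, hop=128):
    """Magnitude STFT. One fixed window length gives one fixed
    time-frequency resolution for all frequencies."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Rows are frequency bins, columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T

def morlet_scalogram(x, fs, freqs, w0=6.0):
    """Magnitude of a continuous wavelet transform, evaluated in the
    frequency domain. Assumption: a Morlet wavelet stands in for the
    bump and Morse wavelets mentioned in the abstract."""
    n = len(x)
    X = np.fft.fft(x)
    # Angular frequency (rad/s) of every FFT bin.
    omega = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / fs)
    rows = []
    for f in freqs:
        # Scale whose wavelet is centred on frequency f (in Hz).
        scale = w0 / (2 * np.pi * f)
        # Fourier transform of the analytic Morlet wavelet: a Gaussian
        # around w0 / scale, restricted to positive frequencies.
        psi_hat = np.exp(-0.5 * (scale * omega - w0) ** 2) * (omega > 0)
        rows.append(np.abs(np.fft.ifft(X * psi_hat)))
    # Rows are analysis frequencies, columns are time samples.
    return np.array(rows)

# Hypothetical one-second segment: a 440 Hz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
segment = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(segment)
freqs = np.geomspace(100, 2000, 32)  # log-spaced analysis frequencies
scal = morlet_scalogram(segment, fs, freqs)

print(spec.shape, scal.shape)
```

In this sketch the wavelet's bandwidth grows with its centre frequency, so the scalogram trades frequency resolution for time resolution at high frequencies, whereas the spectrogram uses the same window everywhere. Either array could then be rendered as an image and passed to a CNN, as the abstract's "deep" representations suggest.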

Cited by 106 publications (46 citation statements)
References 35 publications
“…In our future work, other newly developed models, e.g. [24] and [29], will be considered to explore better recognition methods in the field of ESC.…”
Section: Discussion (mentioning)
confidence: 99%
“…The deep learning models represented by the CNN and Long Short-Term Memory (LSTM) have been widely used in the field of audio processing [22]-[24]. However, we only chose the CNN to construct the basic model of ESC, because the CNN has many advantages over LSTM in ESC tasks.…”
Section: Methods (mentioning)
confidence: 99%
“…A more ideal concept would be to use the raw pulses registered by the sensor, as some authors have done, using a single pulse waveform or several consecutive pulses recorded over time, or transforming the pulses into a spectrogram image. Other representations could also be investigated to avoid limitations imposed by the STFT in the spectrogram representation, such as the Local Polynomial Fourier Transform [32] or a scalogram [33].…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, complex spectrograms have been used to train a speech-enhancement CNN model, which denoises the input signals [14]. In the field of acoustic scene classification, a DL approach based on scalograms and spectrograms paired with a pretrained CNN and a generalized regression neural network has been proposed in [30], providing excellent results in the DCASE 2017 challenge [28]. Besides that, DL methods have been applied to speech emotion recognition problems.…”
Section: Related Work (mentioning)
confidence: 99%