2022
DOI: 10.1016/j.dsp.2022.103434

A capsule network with pixel-based attention and BGRU for sound event detection

Cited by 9 publications (3 citation statements)
References 14 publications
“…In industrial applications, the spectrogram has been used for fault detection using vibration signals [64], fault detection in gearboxes based on sound [65], and fault detection in rotary systems using data from various sensors [66]. Verification of bird diversity [67] and the detection of shoots in forests [68] using ambient sound, seismo-acoustic event prediction using vibration signals and ground waves [69], and an AED system [70] are other state-of-the-art applications that use the spectrogram as a feature extraction method.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The approaches usually proposed in this area are based on multiclass classifiers because SED is considered a multiclass classification problem. In the field of feature extraction, the most commonly used features have been Mel-based features such as Log-Mel [5,6], Log-Mel Power Spectrograms (LMS) [7,8] and Mel Frequency Cepstral Coefficients (MFCC) [9][10][11]. In addition to MFCC, features such as linear predictive coding [12], discrete cosine transforms [13,14], wavelet [9,15], Perceptual Linear Prediction (PLP) [16], Linear Prediction Cepstral Coefficients (LPCC) [17], and Line Spectral Frequencies (LSF) [18] have been used in various studies for SED.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Meng et al. [5] used a bidirectional gated recurrent unit (BGRU) as an RNN for sound event detection. Politis et al. [65] analyzed the classifiers used in Sound Event Localization and Detection in the DCASE 2019 Challenge and concluded that most were CRNN.…”
Section: Literature Review (mentioning)
confidence: 99%