Interspeech 2021
DOI: 10.21437/interspeech.2021-684

Event Specific Attention for Polyphonic Sound Event Detection

Cited by 11 publications (4 citation statements)
“…To investigate the effectiveness of the proposed MGA-Net, we compare it with the state-of-the-art methods [13,27]. As shown in Table 1, the MGA-Net achieves 53.27%, and 56.96% EB-F1 score, 0.709 and 0.739 PSDS score for the validation and public set, respectively, significantly outperforming the compared methods.…”
Section: Results (mentioning)
confidence: 99%
“…The global context modeling is built upon the multi-head self-attention mechanism [25]. Considering the sequential position of input features, we introduce relative positional encoding (RPE) [26] which has been shown effective in SED task [27] to encode position information of inter-frames. The length of attention weights is that of the entire time making the feature representation more global but coarser.…”
Section: Global Context Modeling (mentioning)
confidence: 99%
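
To make the statement above concrete, here is a minimal NumPy sketch of self-attention over audio frames with a relative position term. It is an illustration under stated assumptions, not the method of the citing paper or of [26]: it uses a single head and a scalar bias per clipped frame offset, which is one simple variant of relative positional encoding. The function name `relative_self_attention` and all parameter names are hypothetical.

```python
import numpy as np

def relative_self_attention(x, w_q, w_k, w_v, rel_bias, max_dist):
    """Single-head self-attention with a scalar relative-position bias (assumed variant).

    x: (T, d_in) frame features; w_q, w_k, w_v: (d_in, d) projection matrices;
    rel_bias: (2 * max_dist + 1,) one bias value per clipped offset j - i.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)  # (T, T) content-based scores

    # Offset j - i between key frame j and query frame i, clipped to
    # [-max_dist, max_dist] and shifted so it can index rel_bias.
    t = x.shape[0]
    offsets = np.arange(t)[None, :] - np.arange(t)[:, None]
    offsets = np.clip(offsets, -max_dist, max_dist) + max_dist
    scores = scores + rel_bias[offsets]

    # Row-wise softmax: every query frame attends over all T frames.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy input: 6 frames of 8-dimensional features (random, for shape checking only).
rng = np.random.default_rng(0)
T, d_in, d, max_dist = 6, 8, 4, 3
x = rng.standard_normal((T, d_in))
w_q, w_k, w_v = (rng.standard_normal((d_in, d)) for _ in range(3))
rel_bias = rng.standard_normal(2 * max_dist + 1)
out, weights = relative_self_attention(x, w_q, w_k, w_v, rel_bias, max_dist)
```

Each row of `weights` has one entry per frame in the clip, which matches the quoted remark that the attention weights span the entire time axis.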
“…Some scholars indirectly quantify investor attention using metrics like news headlines (Barber and Odean 2008) and advertising (Hsu and Chen 2019; Yang et al 2021) to examine investor attention. Meanwhile, others identify changes in attention triggered by specific stock market events (Sundar et al 2021), employing event study methodology (Li and Wu 2024). These studies face a significant challenge, as indirect measurements are effective only when investors genuinely notice and read the information.…”
Section: Limited Investor Attention (mentioning)
confidence: 99%
“…These neural networks are usually trained over audio features such as the Short-Time Fourier Transform (STFT), mel-spectrograms, or mel-frequency cepstral coefficients (MFCC) [16], in which a bank of filters is applied to short segments of the input audio signal, obtaining a time-frequency representation of the audio. Some recent approaches have also employed attention-based networks [17][18][19] like Conformers [20], which were originally proposed for Speech Recognition.…”
Section: Introduction (mentioning)
confidence: 99%
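
The time-frequency representation described in the statement above can be sketched with a plain short-time Fourier transform in NumPy. This is a minimal illustration, not the feature pipeline of any cited paper: the frame length (400 samples), hop (160 samples), Hann window, and 16 kHz sample rate are assumed values, the function name `stft_magnitude` is hypothetical, and the mel filter bank and MFCC steps are omitted.

```python
import numpy as np

def stft_magnitude(signal, frame_len=400, hop=160):
    """Magnitude STFT: window short overlapping segments, then take an FFT of each."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    # Stack the short segments into a (n_frames, frame_len) array.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    # One spectrum per segment: rows are time frames, columns are frequency bins.
    return np.abs(np.fft.rfft(frames * window, axis=1))

# Toy input: one second of a 1 kHz sine at an assumed 16 kHz sample rate.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(tone)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sr / 400  # bin spacing is sr / frame_len = 40 Hz
```

With these settings the bin spacing is 40 Hz, so the 1 kHz tone should peak at bin 25. A mel-spectrogram would additionally multiply each row of `spec` by a bank of triangular mel-spaced filters.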