ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8682847

A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling

Abstract: Sound event detection (SED) entails two subtasks: recognizing what types of sound events are present in an audio stream (audio tagging), and pinpointing their onset and offset times (localization). In the popular multiple instance learning (MIL) framework for SED with weak labeling, an important component is the pooling function. This paper compares five types of pooling functions both theoretically and experimentally, with special focus on their performance of localization. Although the attention pooling func…
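To illustrate what a pooling function does in the MIL setting described in the abstract, here is a minimal NumPy sketch. The abstract as shown here is truncated and names only attention pooling, so the other four functions (max, average, linear softmax, exponential softmax) are an assumption about which five are meant. All function names, the (frames, classes) array layout, and the EPS constant are hypothetical choices made for this example, not the authors' code. Each function turns frame-level event probabilities into one clip-level probability per class.

```python
import numpy as np

EPS = 1e-8  # assumed small constant guarding against division by zero


def max_pooling(y):
    """Clip-level probability is the largest frame-level probability."""
    return y.max(axis=0)


def average_pooling(y):
    """Clip-level probability is the unweighted mean over frames."""
    return y.mean(axis=0)


def linear_softmax_pooling(y):
    """Weighted mean in which each frame is weighted by its own probability."""
    return (y * y).sum(axis=0) / (y.sum(axis=0) + EPS)


def exp_softmax_pooling(y):
    """Weighted mean in which each frame is weighted by exp(probability)."""
    w = np.exp(y)
    return (y * w).sum(axis=0) / w.sum(axis=0)


def attention_pooling(y, w):
    """Weighted mean with separately supplied non-negative weights.

    In a trained model the weights would come from a learned layer;
    here they are simply passed in as an array of the same shape as y.
    """
    return (y * w).sum(axis=0) / (w.sum(axis=0) + EPS)


# Toy input: 3 frames, 2 event classes (rows are frames, columns are classes).
frame_probs = np.array([[0.1, 0.9],
                        [0.5, 0.9],
                        [0.9, 0.9]])
uniform_weights = np.ones_like(frame_probs)

clip_max = max_pooling(frame_probs)
clip_avg = average_pooling(frame_probs)
clip_lin = linear_softmax_pooling(frame_probs)
clip_exp = exp_softmax_pooling(frame_probs)
clip_att = attention_pooling(frame_probs, uniform_weights)
```

With uniform weights, the attention sketch reduces to average pooling. The two softmax variants fall between the average and the maximum for a class whose frame probabilities differ.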

Citations: cited by 147 publications (145 citation statements)
References: 19 publications
“…Recent research investigates memory dynamics and control in recurrent neural networks (RNNs) including LSTM [22]. There are comparisons on different pooling functions for AEC/AED [9], and spatio-temporal attention pooling proposed for audio scene classification [23].…”
Section: Introduction (mentioning)
confidence: 99%
“…Wang et al [9] did a thorough analysis theoretically and experimentally of five pooling functions on prediction. The analysis was done for multiple instance learning framework on AED with weak labeling, whose goal is to detect and localize events at the same time.…”
Section: Introductionmentioning
confidence: 99%