2021
DOI: 10.1186/s13636-021-00206-7

Frequency-dependent auto-pooling function for weakly supervised sound event detection

Abstract: Sound event detection (SED), which is typically treated as a supervised problem, aims to detect the types of sound events and their corresponding temporal information. It requires estimating onset and offset annotations for sound events at each frame. Many available sound event datasets contain only audio tags without precise temporal information. Such datasets are therefore classified as weakly labeled. In this paper, we propose a novel source separation-based method trained on weakly labeled data to …

Citations: cited by 2 publications (1 citation statement)
References: 25 publications
“…The authors introduce connectionist temporal classification (CTC) to calculate the loss. At the same time, the adaptive pooling operators are shown to offer better performance on the weakly labeled SED task compared with commonly used pooling operators, such as max-, or average-pooling [23][24][25]. Although these methods have obtained promising results, they have not fully addressed the problems such as overfitting.…”
Section: Introduction (mentioning)
confidence: 99%
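
The contrast the citation statement draws between max-pooling, average-pooling, and adaptive pooling can be illustrated with a small sketch. It assumes a generic softmax-weighted auto-pooling operator with a single scalar parameter `alpha`; this is not the frequency-dependent variant proposed in the paper, and the function names and example frame probabilities are hypothetical.

```python
import math


def max_pool(frame_probs):
    """Clip-level score = the single most confident frame."""
    return max(frame_probs)


def average_pool(frame_probs):
    """Clip-level score = mean over all frames."""
    return sum(frame_probs) / len(frame_probs)


def auto_pool(frame_probs, alpha):
    """Softmax-weighted pooling over time (illustrative assumption).

    Each frame gets weight softmax(alpha * p) across frames.
    alpha = 0 gives average-pooling; large alpha approaches max-pooling.
    In a trained model alpha would be a learnable parameter.
    """
    scores = [alpha * p for p in frame_probs]
    peak = max(scores)  # subtract the peak for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return sum(p * e / total for p, e in zip(frame_probs, exps))


# Hypothetical frame-level probabilities for one event class in one clip.
frame_probs = [0.1, 0.2, 0.9, 0.8, 0.1]

clip_avg = average_pool(frame_probs)
clip_max = max_pool(frame_probs)
clip_auto = auto_pool(frame_probs, alpha=5.0)
```

With a weak (clip-level) label, only the pooled score is compared against the tag, so the choice of pooling operator decides how much each frame contributes to the loss. For any non-negative `alpha`, the auto-pooled score lies between the average-pooled and max-pooled scores.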