2017 International Joint Conference on Neural Networks (IJCNN) 2017
DOI: 10.1109/ijcnn.2017.7966293

Audio event and scene recognition: A unified approach using strongly and weakly labeled data

Abstract: In this paper we propose a novel learning framework called Supervised and Weakly Supervised Learning, where the goal is to learn simultaneously from weakly and strongly labeled data. Strongly labeled data can be simply understood as fully supervised data where all labeled instances are available. In weakly supervised learning the data is only weakly labeled, which prevents one from directly applying supervised learning methods. Our proposed framework is motivated by the fact that a small amount of strongly …

Cited by 31 publications (32 citation statements)
References 25 publications
“…However, no measures to mitigate the effect of label noise were proposed. In [11], classifiers are learnt from weakly labeled web data, and to improve performance an approach is proposed using a small amount of strongly labeled audio along with the web data. In a recent audio tagging Kaggle competition using FSDKaggle2018 [5], a number of approaches were proposed to deal with the label noise present.…”
Section: Introduction (mentioning)
confidence: 99%
“…In MIL, labels are attached to a set of instances, called a bag, rather than to individual instances within the bag. MIL has been widely applied to areas such as audio event detection (AED) [25] [26] [27] and bird sound classification [28]. An attention-based CNN model [18] has also been used for ASC and has provided interpretations in a MIL framework.…”
Section: Multiple Instance Learning (mentioning)
confidence: 99%
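
The bag-level labeling described in the statement above can be illustrated with a minimal sketch. It assumes the standard MIL convention that a bag is positive when at least one of its instances is positive, and it uses max pooling over per-instance scores. Neither choice is stated in the quoted text, and the names `Bag`, `bag_score`, and `predict_bag` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bag:
    """A set of instances that carries a single label for the whole set."""
    instance_scores: List[float]  # hypothetical per-segment event scores in [0, 1]
    label: int                    # weak label: 1 if the event occurs somewhere in the bag


def bag_score(bag: Bag) -> float:
    """Max pooling: the bag is scored by its most positive instance (assumed convention)."""
    return max(bag.instance_scores)


def predict_bag(bag: Bag, threshold: float = 0.5) -> int:
    """Predict the bag-level label without knowing which instance is positive."""
    return int(bag_score(bag) >= threshold)


# A recording where one segment contains the event, and one where none does.
recording = Bag(instance_scores=[0.1, 0.05, 0.9, 0.2], label=1)
background = Bag(instance_scores=[0.1, 0.2, 0.05], label=0)
```

Here only the bag labels are compared against predictions; the instance scores are never supervised directly, which is what distinguishes the weakly labeled setting from the strongly labeled one.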
“…Most recent advances in polyphonic SED are largely attributed to the use of Machine Learning and Deep Neural Networks [8,9,10,11,12,13]. In particular, the use of Convolutional Recurrent Neural Networks (CRNNs) has significantly improved SED performance in the past few years [14,15,16,17].…”
Section: Related Work (mentioning)
confidence: 99%