2019
DOI: 10.1007/s10994-019-05791-5
Data scarcity, robustness and extreme multi-label classification

Abstract: The goal in extreme multi-label classification is to learn a classifier which can assign a small subset of relevant labels to an instance from an extremely large set of target labels. Datasets in extreme classification exhibit a long tail of labels which have a small number of positive training instances. In this work, we pose the learning task in extreme classification with a large number of tail-labels as learning in the presence of adversarial perturbations. This view motivates a robust optimization framework a…
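The long-tail property the abstract mentions can be made concrete with a small sketch: given a binary instance-by-label matrix, count the positive instances per label and measure what fraction of labels are "tail" labels. The simulated matrix, the Zipf-like frequency profile, and the threshold of 5 positives below are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_instances, n_labels = 1000, 200

# Simulate a long-tailed label matrix: label j is positive with
# probability proportional to 1/(j+1), a Zipf-like head/tail profile.
probs = 0.5 / np.arange(1, n_labels + 1)
Y = (rng.random((n_instances, n_labels)) < probs).astype(np.int8)

positives_per_label = Y.sum(axis=0)      # column sums
tail = positives_per_label < 5           # "tail" = fewer than 5 positives
print(f"{tail.mean():.0%} of labels are tail labels")
```

Under this profile, a large fraction of labels end up with only a handful of positive examples, which is the data-scarcity regime the paper targets.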

Cited by 99 publications (73 citation statements)
References 34 publications
“…Additional evidence that classes can be viewed as long-tailed mixtures of subpopulations comes from extreme multiclass problems. Specifically, these problems often have more than 10,000 fine-grained labels and the number of examples per class is long-tailed [3,4,21,30,49,51]. Observe that fine-grained labels in such problems correspond to subcategories of coarser classes (for example, different species of birds all correspond to the "bird" label in a coarse classification problem).…”
Section: Our Contribution
confidence: 99%
“…This algorithm can be derived as a special case (for k = 2) of the refinement step in the constrained clustering routine proposed in [Banerjee and Ghosh, 2006]. It is notable that [Prabhu et al, 2018] derive essentially the same algorithm for splitting labels into balanced clusters, but they derive their approach starting from a different graph flow-based approach to constrained clustering.…”
Section: A1 Defrag Implementation Details
confidence: 99%
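The balanced two-way split this excerpt refers to can be sketched as follows: alternate the usual 2-means steps, but enforce balance by ranking points on the difference of their distances to the two centroids and assigning exactly half to each side. This is a generic sketch of balanced 2-means in the spirit of the cited work, not the exact implementation from either paper; the label representations are random stand-ins.

```python
import numpy as np

def balanced_two_means(X, n_iter=10, seed=0):
    """Split the rows of X into two equal-size clusters.

    Like 2-means, but instead of a free nearest-centroid assignment,
    points are ranked by d(x, c0) - d(x, c1) and the lower half is
    forced into cluster 0, guaranteeing a balanced split.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centroids = X[rng.choice(n, size=2, replace=False)]
    for _ in range(n_iter):
        d0 = np.linalg.norm(X - centroids[0], axis=1)
        d1 = np.linalg.norm(X - centroids[1], axis=1)
        order = np.argsort(d0 - d1)      # most cluster-0-like first
        assign = np.ones(n, dtype=int)
        assign[order[: n // 2]] = 0      # force an even split
        centroids = np.stack([X[assign == k].mean(axis=0) for k in (0, 1)])
    return assign

X = np.random.default_rng(1).normal(size=(100, 16))  # stand-in label vectors
assign = balanced_two_means(X)
```

Applied recursively, such balanced splits yield the roughly balanced label trees used by tree-based extreme classifiers.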
“…One-vs-rest Sometimes also referred to as binary relevance (Zhang et al 2018), these methods learn a classifier per label which distinguishes it from the rest of the labels. In terms of prediction accuracy and label diversity, these methods have been shown to be among the best performing ones for XMC (Babbar and Schölkopf 2017; Yen et al 2017; Babbar and Schölkopf 2019). However, due to their reliance on a distributed training framework, it remains challenging to employ them in resource-constrained environments.…”
Section: Related Work
confidence: 99%
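The one-vs-rest scheme this excerpt describes trains one independent binary classifier per label. A minimal sketch with per-label logistic regression trained by gradient descent; the synthetic data, learning rate, and epoch count are illustrative assumptions:

```python
import numpy as np

def train_one_vs_rest(X, Y, lr=0.1, epochs=200):
    """One weight vector per label, each fit as an independent
    binary logistic regression (label j vs. the rest)."""
    n, d = X.shape
    W = np.zeros((Y.shape[1], d))
    for j in range(Y.shape[1]):
        w = np.zeros(d)
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-(X @ w)))   # sigmoid
            w -= lr * X.T @ (p - Y[:, j]) / n    # mean log-loss gradient
        W[j] = w
    return W

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_W = rng.normal(size=(3, 10))
Y = ((X @ true_W.T) > 0).astype(float)           # 3 linearly separable labels
W = train_one_vs_rest(X, Y)
accuracy = (((X @ W.T) > 0) == Y).mean()
```

Because each label is fit independently, the loop over labels parallelizes trivially, which is why these methods are typically run on distributed training frameworks at extreme scales.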