Proceedings of the Web Conference 2021
DOI: 10.1145/3442381.3450139

Convex Surrogates for Unbiased Loss Functions in Extreme Classification With Missing Labels

Abstract: Extreme Classification (XC) refers to supervised learning where each training/test instance is labeled with a small subset of relevant labels chosen from a large set of possible target labels. The framework of XC has been widely employed in web applications such as automatic labeling of web-encyclopedia articles, prediction of related searches, and recommendation systems. While most state-of-the-art models in XC achieve high overall accuracy by performing well on the frequently occurring labels, they perform poorly…

Cited by 15 publications (7 citation statements) | References 25 publications
“…In this section, we empirically show the usefulness of the proposed plug-in approach by incorporating it into BR and PLT algorithms and comparing these algorithms to their vanilla versions and state-of-the-art methods, particularly those that focus on tail-label performance: PFastreXML [11], ProXML [4], a variant of DiSMEC [3] with a re-balanced and unbiased loss function as implemented in PW-DiSMEC [20] (class-balanced variant), and Parabel [18]. We conduct a comparison on six well-established XMLC benchmark datasets from the XMLC repository [6], for which we use the original train and test splits.…”
Section: Results (mentioning; confidence: 99%)
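To make the quoted plug-in idea concrete, here is a minimal sketch of one common plug-in rule for propensity-scored precision@𝑘; this is an illustration under that assumption, not necessarily the citing paper's exact procedure, and the function name and interface are hypothetical. The idea: take the per-label probability estimates produced by BR or a PLT and re-rank them by the inverse propensities before selecting the top 𝑘.

```python
import numpy as np

def plugin_topk(eta_hat, propensities, k=5):
    """Plug-in prediction for propensity-scored precision@k (a sketch).

    eta_hat[j]      -- estimated probability that label j is relevant
    propensities[j] -- p_j = P(label j observed | label j relevant)

    Ranking by eta_hat / p_j instead of eta_hat alone boosts tail labels
    (small p_j), which is what propensity-scored metrics reward.
    """
    return np.argsort(-(eta_hat / propensities))[:k]
```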
“…If propensities are known, then they can be used to construct an unbiased counterpart ℓ̃ of a task or surrogate loss ℓ [35], in the sense that 𝔼_{𝑦′|𝑦}[ℓ̃(𝑦′, ℎ(𝑥))] = ℓ(𝑦, ℎ(𝑥)). The construction of the unbiased counterpart depends on the form of the propensities; e.g., the label-wise propensities (7) are sufficient for losses decomposable over labels [24], like Hamming loss or binary cross-entropy, but might not be for more complex losses without additional assumptions [30]. The unbiased losses can be used in training procedures [18,26] or for estimating the performance of classifiers. For some losses, such as Hamming loss or precision@𝑘, the Bayes classifier can be written as a function of the conditional label distributions 𝜂_𝑗(𝑥).…”
Section: Missing Labels (mentioning; confidence: 99%)
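As a concrete illustration of such an unbiased construction under label-wise propensities, the sketch below applies the standard inverse-propensity weighting to binary cross-entropy; since the weight 𝑦′_𝑗/𝑝_𝑗 has expectation 𝑦_𝑗 over the observation process, the weighted loss matches the fully-observed loss in expectation. Function names here are illustrative. Note that the weight 1 − 𝑦′_𝑗/𝑝_𝑗 can become negative, which breaks convexity of the resulting objective; this is the kind of issue the convex surrogates of the cited paper are designed to address.

```python
import numpy as np

def unbiased_bce(y_obs, probs, propensities, eps=1e-12):
    """Inverse-propensity-weighted binary cross-entropy (a sketch).

    y_obs[j]        -- observed label y'_j (1 only if relevant AND observed)
    probs[j]        -- predicted probability for label j
    propensities[j] -- p_j = P(y'_j = 1 | y_j = 1)

    Since E[y'_j / p_j | y_j] = y_j, the weighted loss below equals,
    in expectation over the observation process, the BCE on the true labels.
    """
    q = np.clip(probs, eps, 1.0 - eps)
    loss_pos = -np.log(q)        # loss incurred if the label were relevant
    loss_neg = -np.log(1.0 - q)  # loss incurred if the label were irrelevant
    w = y_obs / propensities     # unbiased estimate of the true label
    return np.mean(w * loss_pos + (1.0 - w) * loss_neg)
```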
“…For example, decision tree methods can directly use the propensity-scored variants of metrics such as precision@𝑘 or nDCG@𝑘 [18]. Alternatively, one can use unbiased or upper-bounded propensity-scored surrogate losses [26].…”
Section: Empirical Propensity Model (mentioning; confidence: 99%)
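For reference, propensity-scored precision@𝑘, as commonly defined in the XMLC literature, simply reweights each hit in the top 𝑘 by the inverse propensity of its label. The sketch below shows the unnormalized variant; the name and interface are assumptions for illustration.

```python
import numpy as np

def psp_at_k(y_true, scores, propensities, k=5):
    """Unnormalized propensity-scored precision@k (a sketch):
    PSP@k = (1/k) * sum over the top-k scored labels of y_j / p_j,
    so a correct prediction of a rare (low-p_j) label counts for more.
    """
    topk = np.argsort(-scores)[:k]
    return np.sum(y_true[topk] / propensities[topk]) / k
```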
“…for the large datasets from the extreme classification repository [5] (cf. Figure 1 in [31, 4, 28] for some examples), and for many types of data that are gathered at internet scale [1].…”
Section: Introduction (mentioning; confidence: 99%)