Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining 2023
DOI: 10.1145/3539597.3570392

NGAME: Negative Mining-aware Mini-batching for Extreme Classification

Citations: cited by 9 publications (22 citation statements)
References: 34 publications
“…MUFIN seeks to obtain an embedding $x_i \in \mathbb{R}^D$ for every datapoint $X_i$ and a classifier $w_l \in \mathbb{R}^D$ for every label $l \in [L]$ so that $w_l^\top x_i$ is indicative of the relevance of label $l$ to datapoint $i$. Datapoints and labels each having multiple descriptors i.e. $m_i, m_l \geq 1$ present opportunities to ease this process: (1) The neural architecture used to obtain datapoint embeddings $x_i$ can also be used to obtain label embeddings $\hat{z}_l$ that can serve as a convenient warm start when learning $w_l$ and has been found to accelerate training in XC methods [5,29].…”
Section: Mufin Multimodal Extreme Classification (mentioning)
confidence: 99%
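
The scoring rule and warm start described in the statement above can be illustrated with a minimal sketch. It assumes embeddings are plain Python lists and uses toy 2-dimensional vectors; the helper names (`warm_start_classifiers`, `score_labels`, `top_k`) are hypothetical and are not taken from MUFIN or NGAME.

```python
def dot(u, v):
    """Inner product w^T x of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def warm_start_classifiers(label_embeddings):
    """Initialise each classifier w_l as a copy of the label embedding z_l.

    Assumption: the warm start is a plain copy; training would then
    update these classifiers independently of the embeddings.
    """
    return [list(z) for z in label_embeddings]

def score_labels(x, classifiers):
    """Relevance score of every label l for one datapoint embedding x."""
    return [dot(w, x) for w in classifiers]

def top_k(scores, k):
    """Indices of the k highest-scoring labels, best first."""
    return sorted(range(len(scores)), key=lambda l: scores[l], reverse=True)[:k]

# Toy data (made up for illustration): 3 labels, D = 2.
label_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
classifiers = warm_start_classifiers(label_embeddings)

x = [0.9, 0.1]                      # embedding of one datapoint
scores = score_labels(x, classifiers)
ranked = top_k(scores, 2)
print(ranked)
```

Because the classifiers start as copies of the label embeddings, the initial ranking is simply the embedding-similarity ranking, which is one way to read the "convenient warm start" in the quoted passage.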
“…Minibatches $B$ were created over labels instead of datapoints by sampling labels randomly. This was observed to improve performance over rare labels [5,28]. Training with respect to all $N$ datapoints for each label would have resulted in an $\Omega(NL)$ epoch complexity that is infeasible when $N, L$ are both in the millions.…”
Section: = X1 (mentioning)
confidence: 99%
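
A rough sketch of label-side minibatching follows. The quoted statement is truncated before it says how the $\Omega(NL)$ cost is avoided, so pairing each sampled label with a single randomly chosen positive datapoint is an assumption made here for illustration only; the function names and the toy `positives` map are hypothetical.

```python
import random

def label_minibatches(num_labels, batch_size, rng):
    """Yield minibatches of label ids, visiting every label once per epoch."""
    labels = list(range(num_labels))
    rng.shuffle(labels)                       # random order over labels
    for start in range(0, num_labels, batch_size):
        yield labels[start:start + batch_size]

def pair_with_positive(batch, positives, rng):
    """Pair each label with one of its positive datapoints (assumed strategy).

    positives maps a label id to the list of datapoint ids tagged with it.
    Labels with no positives are skipped.
    """
    return [(l, rng.choice(positives[l])) for l in batch if positives.get(l)]

# Toy ground truth (made up): 5 labels over a handful of datapoints.
positives = {0: [10, 11], 1: [12], 2: [10, 13, 14], 3: [15], 4: [11, 16]}

rng = random.Random(0)
epoch = [pair_with_positive(b, positives, rng)
         for b in label_minibatches(5, 2, rng)]

# One (label, datapoint) pair per label, so the epoch touches L pairs
# instead of the N * L pairs a full sweep would require.
pairs_seen = sum(len(b) for b in epoch)
print(pairs_seen)
```

Since every label appears exactly once per epoch regardless of how many datapoints it has, rare labels receive as many updates as frequent ones, which is consistent with the rare-label benefit the statement reports.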