2020
DOI: 10.1007/978-3-030-58558-7_15

Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-Tailed Classification

Abstract: In real-world scenarios, data tends to exhibit a long-tailed distribution, which increases the difficulty of training deep networks. In this paper, we propose a novel self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME). Our method is inspired by the observation that networks trained on less imbalanced subsets of the distribution often yield better performances than their jointly-trained counterparts. We refer to these models as 'Experts', and the proposed LFME framework ag…
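
The truncated abstract already conveys the core recipe: train several 'expert' models on less imbalanced subsets of the long-tailed data, then distill their knowledge into a single unified student. Below is a minimal sketch of that idea, assuming a plain average of the experts' softened predictions and a fixed loss weight; the paper's self-paced weighting and aggregation details are not reproduced, and the names (`multi_expert_distillation_loss`, `temperature`, `alpha`) are illustrative.

```python
import torch
import torch.nn.functional as F

def multi_expert_distillation_loss(student_logits, expert_logits_list, targets,
                                   temperature=2.0, alpha=0.5):
    """Sketch of multi-expert distillation for long-tailed classification.
    Uniform averaging of expert soft targets is an assumption; the paper's
    self-paced scheme is not reproduced here."""
    # Standard cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, targets)

    # Aggregate the experts' softened predictions as the distillation target.
    soft_targets = torch.stack(
        [F.softmax(logits / temperature, dim=1) for logits in expert_logits_list]
    ).mean(dim=0)

    # KL divergence between the student's softened outputs and the aggregated experts.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        soft_targets,
        reduction="batchmean",
    ) * temperature ** 2

    return alpha * ce + (1.0 - alpha) * kd
```

In the actual framework the student is additionally scheduled from easier to harder instances (the 'self-paced' part), which this sketch omits.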

Cited by 206 publications (127 citation statements)
References 48 publications
“…Beyond its use for model compression, knowledge distillation has also proved effective when the teacher and the student have identical architectures, i.e., self-distillation [37, 38], which transfers knowledge between the same model structure. Knowledge distillation has also been applied in other areas such as semi-supervised learning [15], curriculum learning [39], and neural style transfer [40].…”
Section: Related Work
confidence: 99%
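
As a concrete illustration of the self-distillation setting mentioned in the quote above, here is a minimal sketch in which teacher and student share the same architecture and the frozen teacher's softened outputs supervise the student. The choice of `resnet18`, the temperature `T`, and the weight `alpha` are illustrative assumptions, not details from the cited works.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

# Teacher and student share the same architecture; the teacher is a frozen,
# previously trained copy whose softened outputs supervise the student.
teacher = resnet18(num_classes=10)
student = resnet18(num_classes=10)
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)

def self_distillation_loss(images, labels, T=4.0, alpha=0.7):
    with torch.no_grad():
        t_logits = teacher(images)
    s_logits = student(images)
    # Hard-label loss plus softened teacher-student KL term.
    hard = F.cross_entropy(s_logits, labels)
    soft = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * T * T
    return alpha * soft + (1.0 - alpha) * hard
```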
“…According to Sweeney and Najafian (2020), the more imbalanced/skewed a prediction produced by a trained model is, the more unfair the opportunities it gives across the predefined categories, and the more unfairly discriminative the trained model is. We thus follow previous work (Xiang and Ding, 2020; Sweeney and Najafian, 2020) and use the imbalance divergence metric to evaluate whether a prediction (normally a probability distribution) P is imbalanced/skewed/unfair:…”
Section: Bias Analysis
confidence: 99%
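
The quote above measures how skewed a predicted distribution P is via an 'imbalance divergence'. A minimal sketch, assuming the divergence is taken as the KL divergence of P from the uniform distribution; the cited works may define it differently.

```python
import numpy as np

def imbalance_divergence(p, eps=1e-12):
    """KL divergence of a predicted distribution p from the uniform
    distribution, used here as an illustrative 'imbalance divergence'."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()                      # ensure p is a proper distribution
    u = np.full_like(p, 1.0 / len(p))    # uniform reference distribution
    return float(np.sum(p * np.log((p + eps) / (u + eps))))

# A perfectly balanced prediction scores 0; skewed predictions score higher.
print(imbalance_divergence([0.25, 0.25, 0.25, 0.25]))  # ~0.0
print(imbalance_divergence([0.85, 0.05, 0.05, 0.05]))  # > 0
```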
“…In contrast to two-phase distillation approaches, On-the-fly Native Ensemble [54] was proposed, which treats the training procedure as one-stage online distillation. In 2020, Xiang et al. [55] proposed a novel framework called Learning From Multiple Experts. In the authors' definition, 'Experts' refers to models that extract features from less imbalanced data distributions.…”
Section: B. Multi-Teacher KD
confidence: 99%
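
To make 'models that extract features from less imbalanced data distributions' concrete, here is a small sketch of one plausible way to carve a long-tailed label set into less imbalanced subsets (many/medium/few-shot groups), each of which would train one expert. The grouping thresholds are illustrative assumptions, not values from the paper.

```python
from collections import Counter

def split_into_expert_subsets(labels, thresholds=(100, 20)):
    """Group classes by training-sample count so that each expert sees a
    less imbalanced portion of the long-tailed distribution."""
    counts = Counter(labels)
    many   = {c for c, n in counts.items() if n >= thresholds[0]}
    medium = {c for c, n in counts.items() if thresholds[1] <= n < thresholds[0]}
    few    = {c for c, n in counts.items() if n < thresholds[1]}
    return many, medium, few
```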