2021
DOI: 10.48550/arxiv.2104.09732
Preprint

Knowledge Distillation as Semiparametric Inference

Tri Dao, Govinda M. Kamath, Vasilis Syrgkanis, et al.

Abstract: A popular approach to model compression is to train an inexpensive student model to mimic the class probabilities of a highly accurate but cumbersome teacher model. Surprisingly, this two-step knowledge distillation process often leads to higher accuracy than training the student directly on labeled data. To explain and enhance this phenomenon, we cast knowledge distillation as a semiparametric inference problem with the optimal student model as the target, the unknown Bayes class probabilities as nuisance, an…

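The two-step procedure the abstract describes is standard knowledge distillation: the student is trained to match the teacher's (tempered) class probabilities in addition to the observed labels. Below is a minimal sketch of that objective, assuming a PyTorch setup; the temperature `T`, the mixing weight `alpha`, and the function name `distillation_loss` are illustrative choices, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target (teacher-mimicking) loss with the usual hard-label loss."""
    # Soft targets: KL divergence between tempered student and teacher
    # class probabilities; the T**2 factor keeps gradient magnitudes
    # comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    # Hard targets: ordinary cross-entropy on the labeled data.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example: a batch of 8 examples over 10 classes.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

Setting `alpha=1.0` recovers pure distillation on teacher probabilities, while `alpha=0.0` recovers direct training on labeled data, the baseline the abstract says distillation often beats.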
Cited by 0 publications · References 34 publications