2020
DOI: 10.48550/arxiv.2012.07335
Preprint
LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding

Abstract: Pre-trained models such as BERT have achieved strong results on a variety of natural language processing problems. However, their large number of parameters requires significant memory and inference time, which makes them difficult to deploy on edge devices. In this work, we propose LRC-BERT, a knowledge distillation method based on contrastive learning that fits the output of the intermediate layers from the angular-distance aspect, which is not considered by existing distillation methods…
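The abstract describes matching teacher and student intermediate-layer outputs with a contrastive objective based on angular distance. As a minimal sketch of that idea (not the exact LRC-BERT loss; the function name, temperature, and in-batch-negative scheme below are assumptions), an InfoNCE-style loss over cosine-normalized student/teacher hidden states could look like this in PyTorch:

```python
import torch
import torch.nn.functional as F

def angular_contrastive_distillation_loss(student_h: torch.Tensor,
                                          teacher_h: torch.Tensor,
                                          temperature: float = 0.1) -> torch.Tensor:
    """Hypothetical contrastive distillation loss on intermediate representations.

    student_h, teacher_h: (batch, hidden) outputs of corresponding layers.
    The student/teacher pair for the same example is the positive; the other
    teacher representations in the batch serve as in-batch negatives.
    Cosine similarity of unit-normalized vectors stands in for angular distance.
    """
    s = F.normalize(student_h, dim=-1)   # project onto the unit sphere
    t = F.normalize(teacher_h, dim=-1)
    logits = (s @ t.T) / temperature     # (batch, batch) cosine-similarity matrix
    targets = torch.arange(s.size(0), device=s.device)  # diagonal entries are positives
    return F.cross_entropy(logits, targets)
```

In practice such a term would typically be summed over several intermediate layers and combined with the usual soft-label distillation loss; those details are not specified in the truncated abstract.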

Cited by 1 publication
References 34 publications