2023
DOI: 10.18653/v1/2023.sigtyp-1
Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP

Abstract: Multilingual models have been widely used for cross-lingual transfer to low-resource languages. However, performance on these languages is hindered by their underrepresentation in the pretraining data. To alleviate this problem, we propose a novel multilingual training technique based on teacher-student knowledge distillation. In this setting, we utilize monolingual teacher models optimized for their language. We use those teachers along with balanced (sub-sampled) data to distill the teachers' knowledge in…
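The abstract is truncated, but the mechanism it names (distilling monolingual teachers into a multilingual student) rests on the standard soft-target knowledge-distillation loss. Below is a minimal sketch, assuming PyTorch; the model stand-ins, the language key "sw", and the single-teacher routing are hypothetical illustrations, not the paper's implementation.

```python
# Sketch of teacher-student knowledge distillation (Hinton et al., 2015 style).
# Hypothetical stand-ins: the paper's teachers are monolingual LMs and the
# student is a multilingual LM; linear layers are used here only for brevity.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, p_teacher,
                    reduction="batchmean") * temperature ** 2

# Toy usage: a batch drawn from one language is scored by that language's
# frozen teacher, and its soft targets are distilled into the shared student.
vocab_size, batch, seq, dim = 1000, 4, 16, 32
student = torch.nn.Linear(dim, vocab_size)            # multilingual student head
teachers = {"sw": torch.nn.Linear(dim, vocab_size)}   # per-language teachers
for t in teachers.values():
    t.requires_grad_(False)                           # teachers stay frozen

hidden = torch.randn(batch, seq, dim)                 # stand-in encoder states
with torch.no_grad():
    teacher_logits = teachers["sw"](hidden)
loss = distillation_loss(student(hidden), teacher_logits)
loss.backward()
```

In this setup, balancing is a property of the data pipeline rather than the loss: sub-sampling the high-resource languages before forming batches gives each teacher comparable influence on the student.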

Cited by 0 publications · References 177 publications