Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya

Tela, Abrhalei; Woubie, Abraham; Hautamaki, Ville

doi:10.48550/arxiv.2006.07698

Cited by 1 publication

(1 citation statement)

References 9 publications

“…They propose a zero-shot cross-lingual transfer technique where the resultant model is a monolingual LM adapted to a new language. Tela et al 2020 study adaptation to the extremely low resourced language, Tigrinya. They find that English XLNet generalizes better than BERT and mBERT, which is surprising given that mBERT is trained in multiple languages.…”

Section: Language (mentioning)

confidence: 99%

On the Universality of Deep Contextual Language Models

Bhatt, Goyal, Dandapat, et al. 2021

Preprint

Deep Contextual Language Models (LMs) like ELMo, BERT, and their successors dominate the landscape of Natural Language Processing due to their ability to scale across multiple tasks rapidly by pre-training a single model, followed by task-specific fine-tuning. Furthermore, multilingual versions of such models like XLM-R and mBERT have given promising results in zero-shot cross-lingual transfer, potentially enabling NLP applications in many under-served and under-resourced languages. Due to this initial success, pre-trained models are being used as 'Universal Language Models' as the starting point across diverse tasks, domains, and languages. This work explores the notion of 'Universality' by identifying seven dimensions across which a universal model should be able to scale, that is, perform equally well or reasonably well, to be useful across diverse settings. We outline the current theoretical and empirical results that support model performance across these dimensions, along with extensions that may help address some of their current limitations. Through this survey, we lay the foundation for understanding the capabilities and limitations of massive contextual language models and help discern research gaps and directions for future work to make these LMs inclusive and fair to diverse applications, users, and linguistic phenomena.[1]

[1] Throughout the rest of the paper, "these models", "LMs", "general domain LMs", "contextual LMs", "universal LMs", and all such terms refer to models including but not limited to ELMo, BERT, RoBERTa, GPT, their variants, successors, and multilingual versions.
