2019
DOI: 10.48550/arxiv.1911.02116
Preprint

Unsupervised Cross-lingual Representation Learning at Scale

Cited by 356 publications (538 citation statements)
References 0 publications
“…As examples of neural CLIR models, we evaluated vanilla reranking models [26] fine-tuned with MS-MARCO-v1 [2] for at most one epoch with various multi-language pretrained models, including multilingual-BERT (mBERT) [13], XLM-Roberta-large (XLM-R) [8], and infoXLM-large [6]. Model checkpoints were selected by nDCG@100 on HC4 dev sets.…”
Section: Baseline Runs
confidence: 99%
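The quoted baseline describes fine-tuning cross-encoder rerankers on MS-MARCO-v1 on top of multilingual pretrained encoders such as XLM-R. Below is a minimal sketch of that kind of setup, assuming the Hugging Face transformers library; the checkpoint name, scoring head, and example inputs are illustrative, the head would still need MS MARCO-style fine-tuning before its scores are meaningful, and the nDCG@100 checkpoint selection on the HC4 dev sets is not shown.

```python
# Minimal cross-encoder reranking sketch with XLM-R (illustrative, not the cited code).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# A single-logit classification head on top of XLM-R scores each (query, passage) pair.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")
model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-large", num_labels=1)
model.eval()

def rerank(query, passages):
    """Return passages sorted by relevance score for a single query."""
    inputs = tokenizer([query] * len(passages), passages,
                       padding=True, truncation=True, max_length=256,
                       return_tensors="pt")
    with torch.no_grad():
        scores = model(**inputs).logits.squeeze(-1)  # one relevance score per pair
    order = scores.argsort(descending=True).tolist()
    return [(passages[i], scores[i].item()) for i in order]

print(rerank("what is cross-lingual retrieval?",
             ["CLIR matches queries and documents across languages.",
              "The weather today is sunny."]))
```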
“…A plethora of architectures implementing the attention-based mechanism have been proposed since it was introduced. Models such as BERT [7], RoBERTa [8], XLM [9] or XLM-RoBERTa [10] are being used in a large number of NLP tasks with great success.…”
Section: The Transformer Architecture
confidence: 99%
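As a reminder of the attention mechanism these models share, here is a minimal NumPy sketch of scaled dot-product attention; the shapes and variable names are illustrative and not tied to any particular cited architecture.

```python
# Scaled dot-product attention in NumPy (illustrative sketch).
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays. Returns attention-weighted values."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                          # 4 tokens, 8-dim embeddings
print(scaled_dot_product_attention(x, x, x).shape)   # (4, 8)
```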
“…• Multilingual BERT (mBERT) is a BERT [41] model pretrained with a masked language modeling objective on Wikipedia data covering over 100 languages. • XLM-RoBERTa (XLM-R) [42] is a transformer-based masked language model pretrained on Common Crawl data covering about 100 languages. It was proposed by Facebook and is one of the best-performing transformer models for multilingual tasks.…”
Section: Baseline Approach: Fine-tuning Transformer Models
confidence: 99%
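The baseline described above fine-tunes pretrained multilingual encoders such as XLM-R on a downstream task. A hedged sketch of one fine-tuning step is below, again assuming the Hugging Face transformers library; the checkpoint, toy bilingual batch, label set, and learning rate are placeholders rather than the cited work's configuration.

```python
# Fine-tuning XLM-R for a multilingual classification task (illustrative sketch).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=2)

# Toy multilingual batch (English and Spanish) with binary sentiment labels.
texts = ["I loved this movie.", "No me gustó la película."]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
outputs.loss.backward()
optimizer.step()
print(float(outputs.loss))
```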