2022
DOI: 10.48550/arxiv.2205.06993
Preprint

Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning

Abstract: Machine translation (MT) involving Indigenous languages, including those possibly endangered, is challenging due to lack of sufficient parallel data. We describe an approach exploiting bilingual and multilingual pretrained MT models in a transfer learning setting to translate from Spanish to ten South American Indigenous languages. Our models set new SOTA on five out of the ten language pairs we consider, even doubling performance on one of these five pairs. Unlike previous SOTA that perform data augmentation …

Cited by 1 publication
(1 citation statement)
References 18 publications
“…Ma et al (2020) achieve a better performance on the low-resource Tibetan language by training cross-lingual Chinese-Tibetan embeddings. Generally, transfer learning is a popular approach in neural machine translation when it comes to the lack of data, as described in Zoph et al (2016); Nguyen and Chiang (2017); Kocmi and Bojar (2018); Maimaiti et al (2019); Chen and Abdul-Mageed (2022). However, the cross-lingual transfer aimed at overcoming data scarcity is not limited to related languages (Adams et al, 2017;Agić et al, 2016).…”
Section: Related Work (mentioning)
confidence: 99%