2022
DOI: 10.1162/coli_a_00446

Survey of Low-Resource Machine Translation

Abstract: We present a survey covering the state of the art in low-resource machine translation research. There are currently around 7000 languages spoken in the world and almost all language pairs lack significant resources for training machine translation models. There has been increasing interest in research addressing the challenge of producing useful translation models when very little translated training data is available. We present a summary of this topical research field and provide a description of the techniqu…

citations
Cited by 56 publications
(20 citation statements)
references
References 185 publications
“…Ancient unknown languages unfortunately fall into this latter category, which subsequently makes the associated scripts extremely difficult to decipher. Automatic analysis of low-resource languages is an ongoing problem that has gained momentum in recent years (Haddow et al 2022; Costa-jussà et al 2022). It is our belief that the advancements in this field will have a direct correlation with the advancements in computational decipherments of ancient scripts.…”
Section: Lemmatization (mentioning)
confidence: 99%
“…Ngoc Le and Sadat (2020) focus on data preprocessing, and build a morphological segmenter for the source language Inuktitut to achieve better performance in Inuktitut-English translation. These aforementioned works all adopt methods invented to tackle the task of MT on low-resource languages (Ranathunga et al, 2021;Haddow et al, 2021). We now introduce transfer learning, which is where our approach falls.…”
Section: MT on Indigenous Languages (mentioning)
confidence: 99%
“…These include iterative BT (Hoang et al, 2018), targeting difficult words (Fadaee and Monz, 2018), and tagged BT (Caswell et al, 2019). Section 3.2.1 of Haddow et al (2021) presents a comprehensive survey of BT and its variants as applied to low-resource NMT.…”
Section: Related Work (mentioning)
confidence: 99%