2021
DOI: 10.1007/978-3-030-69143-1_28

Enhanced Back-Translation for Low Resource Neural Machine Translation Using Self-training

Abstract: Many language pairs are low resource: the amount and/or quality of parallel data is not sufficient to train a neural machine translation (NMT) model that can reach an acceptable standard of accuracy. Many works have explored the use of the easier-to-get monolingual data to improve the performance of translation models in this category of languages, and even of high resource languages. The most successful of such approaches is back-translation: using translations of the target language monolingual data to increase t…
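As a rough illustration of the data flow behind back-translation described in the abstract, the sketch below builds synthetic training pairs by translating target-side monolingual sentences back into the source language and appending them to the genuine parallel data. This is a minimal sketch, not the authors' pipeline; `back_translate`, `build_training_set`, and the dummy reverse translator are illustrative stand-ins.

```python
# Minimal sketch of back-translation data augmentation (illustrative only,
# not the authors' exact pipeline).

from typing import Callable, List, Tuple

ParallelData = List[Tuple[str, str]]  # (source, target) sentence pairs


def back_translate(
    mono_target: List[str],
    reverse_translate: Callable[[str], str],
) -> ParallelData:
    """Turn target-side monolingual sentences into synthetic (source, target)
    pairs by translating each one back into the source language with a
    target->source model."""
    return [(reverse_translate(t), t) for t in mono_target]


def build_training_set(parallel: ParallelData, synthetic: ParallelData) -> ParallelData:
    """Concatenate genuine and synthetic pairs; the forward (source->target)
    model is then trained on the combined set."""
    return parallel + synthetic


if __name__ == "__main__":
    # Toy stand-in for a trained target->source NMT model.
    def dummy_reverse(sentence: str) -> str:
        return f"<bt> {sentence}"

    real_pairs = [("ina son ruwa", "I want water")]            # small genuine corpus
    mono_tgt = ["The market opens early.", "Rain is coming."]  # target monolingual data

    synthetic_pairs = back_translate(mono_tgt, dummy_reverse)
    print(build_training_set(real_pairs, synthetic_pairs))
```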


Cited by 11 publications (5 citation statements). References 32 publications.
“…Hausa is a low-resource language; there are limited sources of domain-specific parallel data between it and English [44], [45], [46], [47]. For the news domain we use the Khamenei parallel corpus derived from news articles and speeches on khamenei.ir, which can now only be accessed via the Wayback Machine.…”
Section: A. Data
mentioning
confidence: 99%
“…The calculation of the probability of a class of synonyms by formula (3) is as follows: x shows the biggest (maximum) [11], [12], [13], [14], [15].…”
Section: S E
mentioning
confidence: 99%
“…Zhang et al. [14] proposed using self-learning algorithms to generate pseudo-parallel corpora from monolingual data, and also using two NMT models in a multi-task learning framework to predict translations and reorder source sentences. Abdulmumin et al. [15] proposed an improvement on self-learning methods, namely iterative self-training. Hoang et al. [16] improved the performance of NMT by iterative back-translation in high- and low-resource settings.…”
Section: Introduction
mentioning
confidence: 99%
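The iterative self-training idea referenced in the statement above can be pictured as alternating between training on the current data and regenerating the synthetic portion with the improved model. The sketch below is a schematic under that assumption; `Model`, `train_model`, and `iterative_self_training` are hypothetical placeholders, not code from the cited papers.

```python
# Schematic iterative self-training loop (assumed structure, illustrative only).

from typing import List, Tuple

Pair = Tuple[str, str]  # (source, target) sentence pair


class Model:
    """Placeholder NMT model; a real system would wrap e.g. a Transformer."""

    def __init__(self, data: List[Pair]):
        self.data = data

    def translate(self, sentence: str) -> str:
        # Stand-in for beam-search decoding.
        return f"<hyp> {sentence}"


def train_model(data: List[Pair]) -> Model:
    """Stand-in for training an NMT model from scratch on `data`."""
    return Model(data)


def iterative_self_training(
    parallel: List[Pair],
    mono_source: List[str],
    rounds: int = 3,
) -> Model:
    """Each round: train on genuine + current synthetic data, then regenerate
    the synthetic pairs with the (hopefully better) new model."""
    model = train_model(parallel)
    for _ in range(rounds):
        synthetic = [(s, model.translate(s)) for s in mono_source]
        model = train_model(parallel + synthetic)
    return model


if __name__ == "__main__":
    final_model = iterative_self_training(
        parallel=[("ina son ruwa", "I want water")],
        mono_source=["gari ya waye"],
        rounds=2,
    )
    print(len(final_model.data))
```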