2020
DOI: 10.1609/aaai.v34i01.5341

Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation

Abstract: Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in low-resource scenario. However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between transferor (the parent model) and transferee (the child model) on the source side. To address this challenge, we propose an effective transfer learning approach based on cross-lingu…
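
As a rough illustration of the parent-to-child transfer setting the abstract describes, here is a minimal sketch (not the paper's implementation) of warm-starting a child NMT model from a parent model trained on a different language pair with a shared target language; `TinyNMT` and all names are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of warm-starting a "child" NMT model
# from a "parent" model trained on a different language pair that shares the
# same target language. TinyNMT and all names here are illustrative assumptions.
import torch.nn as nn


class TinyNMT(nn.Module):
    """Toy encoder-decoder stand-in for a real NMT model."""

    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.src_embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)


def transfer_parameters(parent: nn.Module, child: nn.Module) -> None:
    """Copy every parent parameter whose name and shape match into the child.

    In this toy example everything matches; in practice source-side embeddings
    often differ across language pairs and stay randomly initialized, which is
    the source-side mismatch the abstract refers to.
    """
    parent_state = parent.state_dict()
    child_state = child.state_dict()
    for name, tensor in parent_state.items():
        if name in child_state and child_state[name].shape == tensor.shape:
            child_state[name] = tensor.clone()
    child.load_state_dict(child_state)


parent_model = TinyNMT()  # e.g. trained on a high-resource pair
child_model = TinyNMT()   # low-resource / zero-shot pair, same target language
transfer_parameters(parent_model, child_model)
```

Parameters whose names and shapes match (typically the decoder and target-side embeddings) are copied, while mismatched source-side parameters keep their random initialization; closing exactly this source-side gap with cross-lingual pre-training is what the paper targets.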


Cited by 47 publications (48 citation statements)
References 17 publications
“…Different loss functions such as cosine distance [4], Euclidean distance [113], and correlation distance [124] have been shown to be beneficial in reducing the source/pivot divergence. Ji et al. [69] proposed to use pre-trained cross-lingual encoders trained using multilingual MLM, XLM, and BRLM objectives to obtain language-invariant encoder representations. Sen et al. [129] used denoising autoencoding and back-translation to obtain language-invariant encoder representations.…”
Classified as mentioning (confidence: 99%)
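
To make the source/pivot agreement idea concrete, here is a minimal sketch (generic PyTorch code, not taken from the cited works) of the three distances mentioned above, applied to sentence-level encoder representations of parallel source and pivot sentences; the per-dimension Pearson formulation of the correlation distance is an assumption.

```python
# Sketch of source/pivot agreement losses over sentence-level encoder
# representations of parallel source and pivot sentences, shape (batch, dim).
# Illustrative only; not the cited papers' implementations.
import torch
import torch.nn.functional as F


def cosine_distance_loss(src_repr: torch.Tensor, pivot_repr: torch.Tensor) -> torch.Tensor:
    # 1 - cosine similarity, averaged over the batch.
    return (1.0 - F.cosine_similarity(src_repr, pivot_repr, dim=-1)).mean()


def euclidean_distance_loss(src_repr: torch.Tensor, pivot_repr: torch.Tensor) -> torch.Tensor:
    # Mean L2 distance between paired representations.
    return (src_repr - pivot_repr).norm(dim=-1).mean()


def correlation_distance_loss(src_repr: torch.Tensor, pivot_repr: torch.Tensor) -> torch.Tensor:
    # 1 - Pearson correlation, computed per dimension across the batch and
    # averaged (one assumed reading of "correlation distance").
    src_c = src_repr - src_repr.mean(dim=0, keepdim=True)
    piv_c = pivot_repr - pivot_repr.mean(dim=0, keepdim=True)
    corr = (src_c * piv_c).sum(dim=0) / (src_c.norm(dim=0) * piv_c.norm(dim=0) + 1e-8)
    return (1.0 - corr).mean()


src = torch.randn(8, 512)   # source-sentence encoder outputs (toy data)
piv = torch.randn(8, 512)   # pivot-sentence encoder outputs (toy data)
alignment_loss = (
    cosine_distance_loss(src, piv)
    + euclidean_distance_loss(src, piv)
    + correlation_distance_loss(src, piv)
)
```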
“…While almost all encoders are pretrained without an explicit cross-lingual objective, i.e., enforcing that similar words from different languages have similar representations, improvements can be attained through the use of explicit cross-lingually linked data during pretraining, such as bitexts (Conneau and Lample, 2019; Huang et al., 2019; Ji et al., 2019) and dictionaries. As with cross-lingual embeddings (Ruder et al., 2019), these data can be used to support explicit alignment objectives with either linear mappings (Wang et al., 2019, 2020) or fine-tuning (Cao et al., 2020).…”
Section: Introduction; classified as mentioning (confidence: 99%)
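
The "linear mappings" route mentioned above is commonly instantiated as an orthogonal Procrustes fit between dictionary-linked embeddings; a minimal sketch with random stand-in data (an illustration, not the cited authors' code) follows.

```python
# Orthogonal Procrustes alignment of dictionary-paired embeddings:
# find orthogonal W minimizing ||X W - Y||_F, with W = U Vh where
# U S Vh = SVD(X^T Y). Random tensors stand in for real embeddings.
import torch

torch.manual_seed(0)
dim, n_pairs = 64, 500

X = torch.randn(n_pairs, dim)   # source-language embeddings of dictionary words
Y = torch.randn(n_pairs, dim)   # target-language embeddings of their translations

U, _, Vh = torch.linalg.svd(X.T @ Y)
W = U @ Vh                      # orthogonal mapping

mapped = X @ W                  # mapped source embeddings, aligned with Y
print((mapped - Y).norm())
```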
“…Many papers discuss NMT, emphasizing zero-shot neural machine translation techniques [31, 32, 33]. The authors of [34] note that NMT requires smaller data sizes, as small as a few thousand training sentences. In [35], an extensive survey of low-resource NMT is presented.…”
Section: Background and Literature Survey; classified as mentioning (confidence: 99%)