Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.688

In Neural Machine Translation, What Does Transfer Learning Transfer?

Abstract: Transfer learning improves quality for low-resource machine translation, but it is unclear what exactly it transfers. We perform several ablation studies that limit information transfer, then measure the quality impact across three language pairs to gain a black-box understanding of transfer learning. Word embeddings play an important role in transfer learning, particularly if they are properly aligned. Although transfer learning can be performed without embeddings, results are sub-optimal. In contrast, transfe…

Cited by 38 publications (22 citation statements) · References 23 publications

Citation statements (ordered by relevance):
“…Advances in 'transfer learning' may help here (Nguyen & Chiang 2017; Aji et al. 2020), as well as less supervised MT (Artetxe et al. 2018).…”
Section: Translation of Texts (mentioning)
Confidence: 99%
“…For our transfer baseline without romanization, we merge our bilingual baseline vocabulary with that of the parent model, following previous work (Aji et al., 2020; Kocmi and Bojar, 2020). We report BLEU (Papineni et al., 2002) for the multilingual pretrained models trained on original scripts (orig) and romanized with uroman and uconv.…”
Section: Vocabulary Transfer (mentioning)
Confidence: 99%
“…While transfer learning has shown great promise, transfer between languages with different scripts brings additional challenges. For a successful transfer of the embedding layer, both the parent and the child model should use the same or a partially overlapping vocabulary (Aji et al., 2020). It is common to merge the two vocabularies by aligning identical subwords and randomly assigning the remaining subwords from the child vocabulary to positions in the parent vocabulary (Lakew et al., 2018, 2019; Kocmi and Bojar, 2020).…”
Section: Introduction (mentioning)
Confidence: 99%
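The merging strategy described in this excerpt lends itself to a short sketch. The following Python is a minimal, hypothetical illustration (the function name, list-based vocabularies, and seed parameter are assumptions, not taken from any of the cited papers): shared subwords inherit the parent's embedding positions, and the remaining child subwords are scattered randomly over unused parent positions.

```python
import random

def merge_vocabularies(parent_vocab, child_vocab, seed=0):
    """Map each child subword to a parent embedding index.

    Assumes the parent vocabulary is at least as large as the child's.
    """
    parent_index = {tok: i for i, tok in enumerate(parent_vocab)}

    # 1. Align identical subwords: they keep the parent's embedding row.
    mapping = {tok: parent_index[tok] for tok in child_vocab if tok in parent_index}

    # 2. Randomly assign remaining child subwords to unused parent positions.
    used = set(mapping.values())
    free_positions = [i for i in range(len(parent_vocab)) if i not in used]
    random.Random(seed).shuffle(free_positions)

    leftover = (tok for tok in child_vocab if tok not in mapping)
    for tok, pos in zip(leftover, free_positions):
        mapping[tok] = pos
    return mapping
```

The child model's embedding matrix can then be initialized row by row from the parent's, i.e. child_emb[i] = parent_emb[mapping[token_i]], so aligned subwords start from meaningful pretrained vectors while the rest start from arbitrary ones.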
“…This makes the model lack rationality and interpretability to a certain extent. For each prescription, a set of symptoms is treated as a source sentence to be translated into a target sequence, a group of herbs, via an RNN-based Seq2Seq model, an architecture that has achieved strong performance in machine translation [37]. The model consists of an RNN encoder and an RNN decoder: the encoder maps the source token sequence into a latent space, from which the decoder iteratively generates each word of the target sentence.…”
Section: Introduction (mentioning)
Confidence: 99%
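To make the encoder-decoder description in this excerpt concrete, here is a minimal PyTorch sketch of a GRU-based Seq2Seq model with teacher forcing. All names, dimensions, and the use of a single GRU layer are illustrative assumptions, not the cited paper's actual implementation.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """GRU encoder-decoder: the encoder compresses the source sequence
    into a latent state; the decoder unrolls it into the target sequence."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source tokens into a single latent state.
        _, hidden = self.encoder(self.src_emb(src))
        # Decode step by step from that state (teacher forcing:
        # gold target tokens are fed as decoder inputs during training).
        dec_out, _ = self.decoder(self.tgt_emb(tgt), hidden)
        return self.out(dec_out)  # logits over the target vocabulary

# In the prescription setting: symptoms as source tokens, herbs as targets.
model = Seq2Seq(src_vocab=5000, tgt_vocab=3000)
src = torch.randint(0, 5000, (8, 20))  # batch of 8 symptom sequences
tgt = torch.randint(0, 3000, (8, 12))  # batch of 8 herb sequences
logits = model(src, tgt)               # shape: (8, 12, 3000)
```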