Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1103

Rapid Adaptation of Neural Machine Translation to New Languages

Abstract: This paper examines the problem of adapting neural machine translation systems to new, low-resourced languages (LRLs) as effectively and rapidly as possible. We propose methods based on starting with massively multilingual "seed models", which can be trained ahead-of-time, and then continuing training on data related to the LRL. We contrast a number of strategies, leading to a novel, simple, yet effective method of "similar-language regularization", where we jointly train on both a LRL of interest and a simila…
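The "similar-language regularization" strategy sketched in the abstract amounts to jointly training on the LRL pair of interest and a related high-resource pair. A minimal illustration of the data-mixing step, assuming toy in-memory corpora and a hypothetical `mix_corpora` helper (not the paper's actual training pipeline):

```python
import random

def mix_corpora(lrl_pairs, hrl_pairs, hrl_ratio=0.5, seed=0):
    """Jointly sample training pairs from the low-resource language (LRL)
    of interest and a similar high-resource language (HRL), so the HRL
    data acts as a regularizer during continued training."""
    rng = random.Random(seed)
    n_hrl = int(len(lrl_pairs) * hrl_ratio)
    sampled_hrl = rng.sample(hrl_pairs, min(n_hrl, len(hrl_pairs)))
    mixed = lrl_pairs + sampled_hrl
    rng.shuffle(mixed)
    return mixed

# Toy (source, target) sentence pairs; language names are illustrative.
lrl = [("azeri sent %d" % i, "english sent %d" % i) for i in range(4)]
hrl = [("turkish sent %d" % i, "english sent %d" % i) for i in range(100)]

# hrl_ratio=1.0 mixes one sampled HRL pair per LRL pair.
batch = mix_corpora(lrl, hrl, hrl_ratio=1.0)
print(len(batch))  # 4 LRL pairs + 4 sampled HRL pairs
```

The `hrl_ratio` knob is one way to keep the small LRL corpus from being swamped by the similar high-resource data.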


Cited by 165 publications (200 citation statements)
References 18 publications
“…Artificial noises for the source sentences are used to counteract word-by-word training data in unsupervised MT (Artetxe et al., 2018c; Lample et al., 2018a; Kim et al., 2018), but in this work, they are used to regularize the NMT. Neubig and Hu (2018) study adapting a multilingual NMT system to a new language. They train for a child language pair with additional parallel data of its similar language pair.…”
Section: Related Work
confidence: 99%
“…A common challenge in applying natural language processing (NLP) techniques to low-resource languages is the lack of training data in the languages in question. It has been demonstrated that through cross-lingual transfer, it is possible to leverage one or more similar high-resource languages to improve the performance on the low-resource languages in several NLP tasks, including machine translation (Zoph et al., 2016; Johnson et al., 2017; Nguyen and Chiang, 2017; Neubig and Hu, 2018), parsing (Täckström et al., 2012; Ammar et al., 2016; Ahmad et al., 2019), part-of-speech or morphological tagging (Täckström et al., 2013; Cotterell and Heigold, 2017; Malaviya et al., 2018; Plank and Agić, 2018), named entity recognition (Zhang et al., 2016; Mayhew et al., 2017; Xie et al., 2018), and entity linking (Tsai and Roth, 2016; Rijhwani et al., 2019). There are many methods for performing this transfer, including joint training (Ammar et al., 2016; Tsai and Roth, 2016; Cotterell and Heigold, 2017; Johnson et al., 2017; Malaviya et al., 2018), annotation projection (Täckström et al., 2012; Täckström et al., 2013; Zhang et al., 2016; Plank and Agić, 2018), fine-tuning (Zoph et al., 2016; Neubig and Hu, 2018), data augmentation (Mayhew et al., 2017), or zero-shot transfer (Ahmad et al., 2019; Xie et al., 2018; Neubig and Hu, 2018; Rijhwani et al., 2019).…”
Section: Introduction
confidence: 99%
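The fine-tuning style of transfer surveyed above (pretrain on a high-resource setting, then continue training on the low-resource one) can be illustrated with a deliberately tiny toy model. This is an analogy, not an NMT system; all names and the one-parameter least-squares model are hypothetical:

```python
def sgd(theta, data, lr=0.1, epochs=50):
    """Train the one-parameter model y = theta * x by per-example SGD
    on squared error, returning the updated parameter."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (theta * x - y) * x
            theta -= lr * grad
    return theta

# "High-resource" task: y = 2.0 * x, with plenty of data.
hrl_data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
# Related "low-resource" task: y = 2.2 * x, with a single example.
lrl_data = [(1.0, 2.2)]

# Training from scratch on the tiny LRL set with a small budget...
scratch = sgd(0.0, lrl_data, epochs=3)
# ...versus pretraining on the HRL task, then fine-tuning on the LRL set.
pretrained = sgd(0.0, hrl_data)
finetuned = sgd(pretrained, lrl_data, epochs=3)

print(round(scratch, 2), round(finetuned, 2))
```

Under the same tiny fine-tuning budget, the pretrained parameter ends up much closer to the low-resource target than training from scratch does, which is the intuition behind warm-starting from a related high-resource model.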
“…In this section, we demonstrate the types of analysis that are provided by this standard usage of compare-mt. Specifically, we use the example of comparing phrase-based (Koehn et al., 2003) and neural (Bahdanau et al., 2015) Slovak-English machine translation systems from Neubig and Hu (2018).…”
Section: compare-mt ref sys1 sys2
confidence: 99%
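compare-mt, as quoted above, aggregates scores for two systems against a shared reference. A rough sketch of that kind of side-by-side comparison, assuming a crude word-overlap score rather than compare-mt's actual metrics (the scoring function and data here are invented for illustration):

```python
from collections import Counter

def word_overlap(ref, hyp):
    """Fraction of hypothesis words that match the reference, clipped by
    reference word counts -- a crude precision-style adequacy proxy."""
    ref_counts = Counter(ref.split())
    hyp_words = hyp.split()
    if not hyp_words:
        return 0.0
    matched = sum(min(c, ref_counts[w])
                  for w, c in Counter(hyp_words).items())
    return matched / len(hyp_words)

def compare_systems(refs, sys1, sys2):
    """Average per-sentence scores for two systems against the same refs."""
    s1 = sum(word_overlap(r, h) for r, h in zip(refs, sys1)) / len(refs)
    s2 = sum(word_overlap(r, h) for r, h in zip(refs, sys2)) / len(refs)
    return s1, s2

refs = ["the cat sat on the mat", "he went home"]
sys1 = ["the cat sat on a mat", "he goes home"]  # e.g. phrase-based output
sys2 = ["a cat is on the mat", "he went home"]   # e.g. neural output

s1, s2 = compare_systems(refs, sys1, sys2)
print(s1, s2)
```

The real tool goes much further (bucketed word accuracies, n-gram comparisons, significance tests), but the core idea is this aggregate scoring of two hypothesis files against one reference.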