Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1089
Multilingual Neural Machine Translation with Language Clustering

Tan et al.

Abstract: Multilingual neural machine translation (NMT), which translates multiple languages using a single model, is of great practical importance due to its advantages in simplifying the training process, reducing online maintenance costs, and enhancing low-resource and zero-shot translation. Given there are thousands of languages in the world and some of them are very different, it is extremely burdensome to handle them all in a single model or use a separate model for each language pair. Therefore, given a fixed res…

Cited by 167 publications (189 citation statements: 5 supporting, 184 mentioning, 0 contrasting). References 25 publications.

“…Our conclusions are similar to those of works that have attempted to cluster learned language vectors: Östling and Tiedemann (2016) and Tan et al. (2019) both find that hierarchical clusters of language vectors discover linguistic similarity, with the former finding fine-grained clusterings for Germanic languages. In a similar vein, Tiedemann (2018) visualizes language vectors and finds that they roughly cluster by linguistic family.…”
Section: Representations Cluster By Language Similarity (supporting)
confidence: 89%
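
The analysis these works describe can be illustrated with a short, self-contained sketch: hierarchically cluster a set of learned language vectors and read off the resulting groups. The language list and random vectors below are hypothetical placeholders, not data from any cited paper.

```python
# A minimal sketch of hierarchical clustering over learned language vectors,
# in the spirit of the analyses quoted above. All inputs are toy stand-ins.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
languages = ["de", "nl", "sv", "fr", "es", "it"]      # hypothetical sample
lang_vectors = rng.normal(size=(len(languages), 32))  # stand-in for learned embeddings

# Normalize so euclidean distance tracks cosine similarity, then apply
# Ward linkage (one common choice for this kind of analysis).
normalized = lang_vectors / np.linalg.norm(lang_vectors, axis=1, keepdims=True)
Z = linkage(normalized, method="ward")

# Cut the tree into a fixed number of clusters and print the grouping; with
# real language vectors these groups tend to track linguistic families.
for lang, label in zip(languages, fcluster(Z, t=2, criterion="maxclust")):
    print(lang, "-> cluster", label)
```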
“…Recent work on interpretability for NLU tasks uses methods such as diagnostic tasks (Belinkov et al., 2017; Tenney et al., 2019; Belinkov et al., 2018), attention-based methods (Raganato and Tiedemann, 2018) or task analysis (Zhang and Bowman, 2018), and is primarily focused on understanding the linguistic features encoded by a trained model. Some recent works compare learned language vectors (Östling and Tiedemann, 2016; Tan et al., 2019; Tiedemann, 2018) and reach conclusions similar to ours. To the best of our knowledge, we are the first to compare the hidden representations of the sentences themselves.…”
Section: SVCCA for Sequences (supporting)
confidence: 83%
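
SVCCA, the comparison tool named in this section, first reduces each set of representations with an SVD and then measures canonical correlations between the reduced sets. A hedged sketch under toy inputs follows; the variance threshold and dimensions are illustrative assumptions, not values from the cited work.

```python
# A minimal numpy sketch of SVCCA for comparing two sets of hidden
# representations X, Y of shape (n_points, dim). Illustrative only.
import numpy as np

def svcca(X, Y, keep=0.99):
    """Mean SVCCA correlation between two representation matrices."""
    def svd_reduce(A, keep):
        A = A - A.mean(axis=0)  # center each dimension
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        # Keep enough singular directions to explain `keep` of the variance.
        k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), keep) + 1
        return U[:, :k] * s[:k]
    Xr, Yr = svd_reduce(X, keep), svd_reduce(Y, keep)
    # Canonical correlations via SVD of the product of orthonormal bases.
    Qx, _ = np.linalg.qr(Xr)
    Qy, _ = np.linalg.qr(Yr)
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return rho.mean()

X = np.random.randn(1000, 64)    # e.g. hidden states from model A
Y = X @ np.random.randn(64, 48)  # correlated toy representations for model B
print("mean SVCCA correlation:", round(svcca(X, Y), 3))
```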
“…We expect that the balance between language-agnostic and language-specific representations should depend on the language pairs. Prasanna [117] and Tan et al. [141] are some of the works that cluster languages into language families and train separate MNMT models per family. Language families can be decided by using linguistic knowledge [117] or by using embedding similarities, where the embeddings are obtained from a multilingual word2vec model [141].…”
Section: Addressing Language Divergence (mentioning)
confidence: 99%
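
The family-based strategy in this passage can be sketched end to end: derive a vector per language (the cited works use linguistic knowledge or embeddings from a multilingual word2vec model), partition languages into families, and train one MNMT model per family. The embeddings and the number of families below are illustrative assumptions, and training itself is elided.

```python
# A hedged sketch of clustering languages into families and assigning one
# multilingual NMT model per family. All inputs are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

langs = ["de", "nl", "sv", "fr", "es", "it"]          # hypothetical language set
embs = np.random.default_rng(1).normal(size=(6, 16))  # stand-in language vectors
embs /= np.linalg.norm(embs, axis=1, keepdims=True)   # cosine-style geometry

n_families = 2  # illustrative choice
labels = KMeans(n_clusters=n_families, n_init=10, random_state=0).fit_predict(embs)

families = {}
for lang, label in zip(langs, labels):
    families.setdefault(label, []).append(lang)
for label, members in families.items():
    # In the described setup, each family gets its own MNMT model.
    print(f"family {label}: train one MNMT model covering {members}")
```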
“…In reinforcement learning, knowledge distillation has been used to regularize multi-task agents (Parisotto et al., 2016; Teh et al., 2017). In NLP, Tan et al. (2019) distill single-language-pair machine translation systems into a many-language system. However, they focus on multilingual rather than multi-task learning, use a more complex training procedure, and only experiment with Single→Multi distillation.…”
Section: Related Work (mentioning)
confidence: 99%
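
The Single→Multi distillation this passage refers to is, at its core, a token-level objective: a multilingual student matches the softened output distribution of a single-pair teacher while still fitting the gold targets. A minimal sketch with hypothetical logits follows; the blending weight `alpha` and temperature `T` are illustrative hyperparameters, not values from the cited work.

```python
# A minimal sketch of token-level knowledge distillation for NMT:
# blend hard-label cross-entropy with KL to the teacher's soft distribution.
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, gold_ids, alpha=0.5, T=1.0):
    """Combined distillation + negative log-likelihood loss."""
    nll = F.cross_entropy(student_logits, gold_ids)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * kd + (1 - alpha) * nll

# Toy usage: a batch of 8 target tokens over a 100-word vocabulary.
s = torch.randn(8, 100, requires_grad=True)  # student (multilingual) logits
t = torch.randn(8, 100)                      # teacher (single-pair) logits
y = torch.randint(0, 100, (8,))              # gold target token ids
loss = distill_loss(s, t, y)
loss.backward()
```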