Proceedings of the 2nd Workshop on Neural Machine Translation and Generation 2018
DOI: 10.18653/v1/w18-2711

Multi-Source Neural Machine Translation with Missing Data

Abstract: Multi-source translation systems translate from multiple languages to a single target language. By using information from these multiple sources, these systems achieve large gains in accuracy. To train these systems, it is necessary to have corpora with parallel text in multiple sources and the target language. However, these corpora are rarely complete in practice due to the difficulty of providing human translations in all of the relevant languages. In this paper, we propose a data augmentation approach to f…
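
The truncated abstract describes a data-augmentation approach for training multi-source systems on incomplete corpora. As a rough illustration of the idea, the sketch below completes a multi-source corpus either with a fixed placeholder token or with a pseudo-source translated from a language that is present; both fill-in strategies, the `translate` stub, and every name in it are illustrative assumptions, not the paper's confirmed method.

```python
# Minimal sketch of filling in missing source sentences so that every
# training example has text in all source languages. The two strategies
# shown (a NULL placeholder and a pseudo-source synthesized by translating
# from an available source) are assumptions for illustration only.
from typing import Dict, List, Optional

NULL_TOKEN = "<NULL>"  # placeholder fed to the encoder of a missing source


def translate(sentence: str, src_lang: str, tgt_lang: str) -> str:
    """Hypothetical single-source MT model used to synthesize pseudo-sources."""
    return f"[{src_lang}->{tgt_lang}] {sentence}"


def fill_missing_sources(
    corpus: List[Dict[str, Optional[str]]],
    source_langs: List[str],
    use_pseudo_translation: bool = True,
) -> List[Dict[str, str]]:
    """Return a corpus in which every example has a sentence for every source.

    Each example maps language codes to sentences; a missing source is None.
    """
    completed = []
    for example in corpus:
        filled = dict(example)
        # Source languages that do have a sentence in this example.
        available = [lang for lang in source_langs if example.get(lang)]
        for lang in source_langs:
            if filled.get(lang):
                continue
            if use_pseudo_translation and available:
                # Synthesize the missing source from an available one.
                filled[lang] = translate(example[available[0]], available[0], lang)
            else:
                # Fall back to a fixed placeholder the model learns to ignore.
                filled[lang] = NULL_TOKEN
        completed.append(filled)
    return completed


# Example: the French source is missing and gets filled in.
corpus = [{"en": "Good morning.", "fr": None, "tgt": "Guten Morgen."}]
print(fill_missing_sources(corpus, ["en", "fr"]))
```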

Cited by 43 publications (21 citation statements)
References 16 publications
“…Some works target learning a universal representation for all languages, either by leveraging semantic sharing between mapped word embeddings (Gu et al., 2018) or by using character n-gram embeddings (Wang et al., 2019) to optimize subword sharing. More closely related to data augmentation, Nishimura et al. (2018) fill in missing data in a multi-source setting to boost multilingual translation.…”
Section: Related Work
confidence: 99%
“…In all of their experiments, the multi-source methods significantly surpass the single-source baseline. Nishimura et al. (2018) extend the former approach to situations where one of the source languages is missing, so that the translation system does not overly rely on a single source language, as some of the models presented in this work do.…”
Section: Related Work
confidence: 99%
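
This citation statement points at the core decoding question: how a multi-source model combines several inputs, and how it should behave when one of them is absent. A minimal sketch under stated assumptions follows: a uniform average of per-source next-token log-probabilities, computed only over the sources that are present; `per_source_logprobs`, the toy vocabulary, and the dummy scores are hypothetical stand-ins, not the cited papers' formulation.

```python
# Sketch of multi-source combination at decoding time that simply drops a
# missing source instead of depending on it. The uniform log-linear
# combination and toy vocabulary are illustrative assumptions.
import math
from typing import Dict, List, Optional

VOCAB = ["Guten", "Morgen", "Tag", "</s>"]


def per_source_logprobs(source: str, prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for one single-source model's next-token scores.

    `prefix` is the target prefix generated so far (unused by this dummy).
    """
    scores = {tok: -float(len(tok)) for tok in VOCAB}  # dummy scores
    z = math.log(sum(math.exp(s) for s in scores.values()))
    return {tok: s - z for tok, s in scores.items()}


def combined_next_token(sources: Dict[str, Optional[str]], prefix: List[str]) -> str:
    """Average log-probabilities over the sources that are actually present."""
    present = {lang: s for lang, s in sources.items() if s is not None}
    totals = {tok: 0.0 for tok in VOCAB}
    for lang, sent in present.items():
        logprobs = per_source_logprobs(sent, prefix)
        for tok in VOCAB:
            totals[tok] += logprobs[tok] / len(present)
    return max(totals, key=totals.get)


# The French source is missing, so only the English model contributes.
print(combined_next_token({"en": "Good morning.", "fr": None}, []))
```
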
“…Domain adaptation: Some contributions examined regularization methods for adaptation and "extreme adaptation" to individual speakers (Michel and Neubig, 2018). Data augmentation: A number of the contributed papers examined ways to augment data for more efficient training; these include methods for considering multiple back translations, iterative back translation (Hoang et al., 2018b), bidirectional multilingual training (Niu et al., 2018), and document-level adaptation (Kothur et al., 2018). Inadequate resources: Several contributions involved settings in which resources were insufficient, such as investigating the impact of noise, missing data in multi-source settings (Nishimura et al., 2018), and one-shot learning (Pham et al., 2018).…”
Section: Summary of Research Contributions
confidence: 99%