Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1570
Understanding Data Augmentation in Neural Machine Translation: Two Perspectives towards Generalization

Abstract: Many Data Augmentation (DA) methods have been proposed for neural machine translation. Existing works measure the superiority of DA methods in terms of their performance on a specific test set, but we find that some DA methods do not exhibit consistent improvements across translation tasks. Based on the observation, this paper makes an initial attempt to answer a fundamental question: what benefits, which are consistent across different methods and tasks, does DA in general obtain? Inspired by recent theoretic…

Cited by 19 publications (14 citation statements). References 21 publications.
“…A translation model is then trained using both the pseudo-parallel and the original parallel data. Li et al. (2019) analyzed multiple data augmentation methods. In their experiments, they applied self-training and back-translation.…”
Section: Related Work
confidence: 99%
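The two augmentation strategies this citation names are easy to sketch. Below is a minimal, hypothetical Python illustration of how back-translation and self-training each produce pseudo-parallel pairs that are then mixed with the original parallel data; the toy model callables stand in for trained NMT systems and are not from the paper.

```python
# Sketch of the two DA strategies named above: back-translation
# (synthetic source + real target) and self-training (real source +
# synthetic target). All model names/interfaces are hypothetical.

from typing import Callable, List, Tuple

ParallelCorpus = List[Tuple[str, str]]  # (source, target) pairs


def back_translate(targets: List[str],
                   tgt2src_model: Callable[[str], str]) -> ParallelCorpus:
    """Pseudo-parallel pairs: synthetic source paired with a real target."""
    return [(tgt2src_model(t), t) for t in targets]


def self_train(sources: List[str],
               src2tgt_model: Callable[[str], str]) -> ParallelCorpus:
    """Pseudo-parallel pairs: real source paired with a synthetic target."""
    return [(s, src2tgt_model(s)) for s in sources]


if __name__ == "__main__":
    original: ParallelCorpus = [("guten morgen", "good morning")]

    # Toy callables standing in for trained reverse/forward NMT models.
    toy_tgt2src = lambda t: "guten morgen" if t == "good morning" else t
    toy_src2tgt = lambda s: "good morning" if s == "guten morgen" else s

    pseudo = back_translate([t for _, t in original], toy_tgt2src)
    pseudo += self_train([s for s, _ in original], toy_src2tgt)

    # The NMT model is then trained on original + pseudo-parallel data.
    training_data = original + pseudo
    print(training_data)
```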
“…First, we back-translate the target side of the parallel corpus (Li et al., 2019; Sennrich et al., 2016) to create pseudo data as additional training data. Note that we do not use external data in back-translation, and the diversity of target sentences does not change.…”
Section: Data Augmentation by Sentence Concatenation
confidence: 99%
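As a variant of the sketch above, the setup this citation describes back-translates only the target side of the existing parallel corpus: every real target sentence is kept and only synthetic sources are added, so target-side diversity is unchanged. A minimal, assumed illustration (the reverse model is again a hypothetical stand-in):

```python
# Back-translation without external data (sketch): generate a synthetic
# source for every target sentence already in the parallel corpus, then
# train on original + pseudo pairs. No new target sentences are
# introduced, so target-side diversity stays the same.

from typing import Callable, List, Tuple


def augment_within_corpus(parallel: List[Tuple[str, str]],
                          tgt2src_model: Callable[[str], str]
                          ) -> List[Tuple[str, str]]:
    pseudo = [(tgt2src_model(tgt), tgt) for _, tgt in parallel]
    return parallel + pseudo


if __name__ == "__main__":
    corpus = [("guten morgen", "good morning")]
    toy_reverse = lambda t: "<synthetic> " + t  # stand-in reverse model
    print(augment_within_corpus(corpus, toy_reverse))
```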
“…Margin (Bartlett et al., 2017) is a classic concept in support vector machines, measuring the geometric distance between the support vectors and the decision boundary. To apply margin to NMT models, we follow Li et al. (2019) and compute word-wise margin, defined as the probability of the correctly predicted word minus the maximum probability of any other word type. We compute the word-wise margin over the training set and report the averaged value.…”
Section: A3 Generalization Capability
confidence: 99%
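The word-wise margin described in this citation is straightforward to compute from a model's softmax outputs. A minimal NumPy sketch, under the assumption that "correctly predicted word" means the reference word at each position; the names and shapes are illustrative, not the paper's code:

```python
# Word-wise margin (sketch): probability of the reference word minus the
# maximum probability assigned to any other word type, averaged over all
# positions in the training set. `probs` stands in for softmax outputs.

import numpy as np


def wordwise_margin(probs: np.ndarray, reference_ids: np.ndarray) -> float:
    """probs: (num_words, vocab_size) softmax rows, one per position;
    reference_ids: (num_words,) gold word indices."""
    rows = np.arange(len(reference_ids))
    gold = probs[rows, reference_ids]          # P(reference word)
    masked = probs.copy()
    masked[rows, reference_ids] = -np.inf      # exclude the gold word
    runner_up = masked.max(axis=1)             # max P(other word type)
    return float((gold - runner_up).mean())


if __name__ == "__main__":
    probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.6, 0.3]])
    refs = np.array([0, 1])
    print(wordwise_margin(probs, refs))  # ((0.7-0.2) + (0.6-0.3)) / 2 = 0.4
```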
“…Similar steps are performed for IWSLT17 Fr-En. For the Zh-En dataset, we mainly adopt the same data-split method as Nguyen, Daumé, and Boyd-Graber (2017) and Li et al. (2019), except that we use both their 'Supervised training' and 'Bandit training' sets as the training set. We use the Stanford Chinese word segmenter (Chang, Galley, and Manning 2008) to segment Chinese sentences.…”
confidence: 99%