Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) 2019
DOI: 10.18653/v1/w19-5321

Microsoft Translator at WMT 2019: Towards Large-Scale Document-Level Neural Machine Translation

Abstract: This paper describes the Microsoft Translator submissions to the WMT19 news translation shared task for English-German. Our main focus is document-level neural machine translation with deep transformer models. We start with strong sentence-level baselines, trained on large-scale data created via data filtering and noisy back-translation, and find that back-translation seems to mainly help with translationese input. We explore fine-tuning techniques, deeper models and different ensembling strategies to counter th…

Cited by 88 publications (66 citation statements)
References 8 publications
“…For the non-sliding window approaches with fixed maximum size, sentence splitting is not as straightforward and requires some additional treatment. Segments are also separated by separation tokens but we realized that they do not necessarily match with the segment boundaries in the reference data even though the original paper suggests that this should be rather stable (Junczys-Dowmunt, 2019). This is especially fatal if the number of segments does not match.…”
Section: Discussion
confidence: 86%
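The segment-boundary problem quoted above can be sketched in a few lines: a document-level hypothesis is split on a separator token and paired with the reference segments, and evaluation fails outright when the counts disagree. This is an illustrative sketch only — the `<sep>` token and the function names are assumptions, not the actual markup or tooling used by the cited systems.

```python
SEP = "<sep>"  # illustrative separator token; the cited systems may use a different marker


def split_segments(doc_output, sep=SEP):
    """Split a document-level translation into sentence segments on the separator token."""
    return [s.strip() for s in doc_output.split(sep)]


def align_or_flag(hypothesis, reference_segments):
    """Pair hypothesis segments with reference segments.

    Returns None when the segment counts disagree -- the failure case the
    citing paper describes as especially fatal for segment-level evaluation.
    """
    hyp_segments = split_segments(hypothesis)
    if len(hyp_segments) != len(reference_segments):
        return None  # predicted segment count does not match the reference
    return list(zip(hyp_segments, reference_segments))


# Example: the model emits 2 segments where the reference has 3.
hyp = "Das ist gut. <sep> Es funktioniert."
ref = ["Das ist gut.", "Es funktioniert.", "Wirklich."]
print(align_or_flag(hyp, ref))  # mismatch -> None
```

A sliding-window approach sidesteps this check because each window is scored against a known span of reference sentences; with fixed-size concatenation the alignment must be recovered from the separator tokens, which is exactly where the mismatch arises.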
“…• we train different context-aware MT models (Tiedemann and Scherrer, 2017; Agrawal et al., 2018; Maruf et al., 2019; Junczys-Dowmunt, 2019) on the two datasets and evaluate them using a comparative setup with artificially scrambled data.…”
Section: Introduction
confidence: 99%
“…We use the WMT14 and WMT19 newstests as validation and test sets respectively. The baseline system scores 37.99 BLEU on the full WMT19 newstest, which compares favorably with strong single-system baselines at the WMT19 shared task (Junczys-Dowmunt, 2019).…”
Section: Model Architecture and Implementation Details
confidence: 81%
“…Document-level machine translation (MT) systems are needed to deal with coreference as a whole. Although some attempts to include extra-sentential information exist (Wang et al., 2017; Voita et al., 2018; Jean and Cho, 2019; Junczys-Dowmunt, 2019), the problem is far from being solved. Besides that, some further problems of NMT that do not seem to be related to coreference at first glance (such as the translation of unknown words and proper names or the hallucination of additional words) cause coreference-related errors.…”
Section: Introduction
confidence: 99%