Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) 2019
DOI: 10.18653/v1/w19-5351
Linguistic Evaluation of German-English Machine Translation Using a Test Suite

Abstract: We present the results of applying a grammatical test suite for German→English MT to the systems submitted at WMT19, with a detailed analysis of 107 phenomena organized into 14 categories. The systems still translate one out of four test items incorrectly on average. Performance is low for idioms, modals, pseudo-clefts, multi-word expressions and verb valency. Compared to last year, there has been an improvement on function words, non-verbal agreement and punctuation. More detailed conclusions…
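The abstract reports pass rates aggregated over individual test items into phenomenon and category scores. A minimal sketch of that kind of aggregation, assuming a simple (category, phenomenon, passed) record layout; the category and phenomenon names below are illustrative, not taken from the paper:

```python
# Hypothetical sketch: aggregating test-suite results into per-category
# pass rates. Record layout and names are assumptions for illustration.
from collections import defaultdict

def category_accuracy(results):
    """results: iterable of (category, phenomenon, passed) triples;
    returns the fraction of passed items per category."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for category, _phenomenon, ok in results:
        total[category] += 1
        passed[category] += int(ok)
    return {c: passed[c] / total[c] for c in total}

results = [
    ("MWE", "idiom", False),
    ("MWE", "collocation", True),
    ("Verb tense/aspect/mood", "modal", False),
    ("Verb tense/aspect/mood", "conditional", True),
]
print(category_accuracy(results))  # {'MWE': 0.5, 'Verb tense/aspect/mood': 0.5}
```

An overall score such as "one out of four items wrong" then falls out of the same counts summed over all categories.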

Cited by 16 publications (17 citation statements); references 10 publications.
“…The third rule that we conform to is to 1) create two contrastive source sentences for each lexical or syntactic ambiguity point, where each source sentence corresponds to one reasonable interpretation of the ambiguity point, and 2) provide two contrastive translations for each created source sentence. This is similar to other linguistic evaluation by contrastive examples in the MT literature (Avramidis et al., 2019; Bawden et al., 2018; Müller et al., 2018; Sennrich, 2017). These two contrastive translations have similar wordings: one is correct and the other is incorrect in that it translates the ambiguous part with the translation corresponding to the contrastive source sentence.…”
Section: Test Suite Design (supporting)
confidence: 83%
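The contrastive design described in this excerpt can be sketched as a small data structure plus a preference check: a test item passes if the system under evaluation prefers the correct translation over the contrastive one. All names and the scoring interface below are assumptions for illustration, not the cited authors' code:

```python
# Hypothetical sketch of a contrastive test item and its pass criterion.
# The ContrastiveItem layout and score() interface are assumptions.
from dataclasses import dataclass

@dataclass
class ContrastiveItem:
    source: str        # one reading of the ambiguity point
    correct: str       # translation matching this reading
    contrastive: str   # translation matching the *other* reading

def item_passes(model_score, item):
    """model_score(src, tgt) -> float, higher is better; the item
    passes if the model scores the correct translation higher."""
    return model_score(item.source, item.correct) > \
           model_score(item.source, item.contrastive)

# Example with a dummy scorer (a real one would be an MT model's
# log-probability of the target given the source).
item = ContrastiveItem(
    source="Er sah den Mann mit dem Fernglas.",
    correct="He saw the man with the binoculars.",
    contrastive="He saw the man who had the binoculars.",
)
dummy = lambda src, tgt: -len(tgt)  # toy: prefers shorter output
print(item_passes(dummy, item))
```

Each ambiguity point yields two such items (one per reading), so a system that always picks one reading fails half of them.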
“…German-to-English (Avramidis et al., 2019): The test suite by DFKI covers 107 grammatical phenomena organized into 14 categories. The test suite is very closely related to the one used last year (Macketanz et al., 2018), which allows an evaluation over time.…”
Section: Linguistic Evaluation Of (mentioning)
confidence: 99%
“…In WMT 2019, English-German phenomena were tested with a new corpus, using both human and automatic evaluation. It is not possible, however, to use this evaluation outside the competition (Avramidis et al., 2019).…”
(mentioning)
confidence: 99%