Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-long.242
Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort

Abstract: In Machine Translation, assessing the quality of a large number of automatic translations can be challenging. Automatic metrics are not reliable when it comes to high-performing systems. In addition, resorting to human evaluators can be expensive, especially when evaluating multiple systems. To overcome the latter challenge, we propose a novel application of online learning that, given an ensemble of Machine Translation systems, dynamically converges to the best systems by taking advantage of the human feedback…

Cited by 3 publications (11 citation statements). References 31 publications.
“…The bandit algorithms learned from feedback simulated using a sentence-level BLEU score (Papineni et al. 2002) between the selected automatic translation and a reference translation. Another one is a previous work of ours (Mendonça et al. 2021), in which we framed the problem of dynamically converging to the performance of the best individual MT system as a problem of prediction with expert advice (when feedback is available for the translations output by all the systems in the ensemble) and adversarial multi-armed bandits (Robbins 1952; Lai and Robbins 1985) (when feedback is only available for the final translation chosen by the ensemble). We simulated the human-in-the-loop by using actual human ratings obtained from an MT shared task (Barrault et al. 2019), when available in the data, and proposed different fallback strategies to cope with the lack of human ratings for some of the translations.…”
Section: Online Learning for Machine Translation (mentioning; confidence: 99%)
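The two feedback regimes described above map onto two classic online-learning updates: with full feedback, every system's weight is adjusted by its own loss (Hedge-style prediction with expert advice); with bandit feedback, only the sampled system's weight is updated, using an importance-weighted loss (Exp3-style). A minimal sketch of both updates follows — this is an illustrative reconstruction, not the cited papers' implementation, and the learning rates `eta` and `gamma` and the `rate_fn` oracle are assumed names:

```python
import math
import random

def hedge_update(weights, losses, eta=0.5):
    """Full-information setting (prediction with expert advice):
    every system's translation is rated, so each weight is
    multiplied down according to its own loss in [0, 1]."""
    return [w * math.exp(-eta * l) for w, l in zip(weights, losses)]

def exp3_step(weights, rate_fn, gamma=0.1):
    """Bandit setting: only the chosen system's translation is rated.
    Sample a system from a gamma-smoothed distribution, observe its
    loss, and apply an importance-weighted update so the loss
    estimate stays unbiased."""
    total = sum(weights)
    k = len(weights)
    probs = [(1 - gamma) * w / total + gamma / k for w in weights]
    i = random.choices(range(k), weights=probs)[0]
    loss = rate_fn(i)  # stands in for a human rating in [0, 1]
    weights[i] *= math.exp(-gamma * loss / (k * probs[i]))
    return i, weights
```

Under either update, systems that consistently receive poor ratings lose weight and are chosen less often, so the ensemble converges toward the best individual system while spending human ratings mostly on translations it actually serves.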
“…We thus build upon our previous work (Mendonça et al. 2021), since it is the only one that takes advantage of human quality ratings rather than translations or post-edits. This setting not only reduces the human effort involved in improving the MT ensemble, but should also prove more suitable for representing real-world MT scenarios, such as Web translation systems or MT shared tasks, in which the human-in-the-loop is not expected to provide a translation.…”
Section: Online Learning for Machine Translation (mentioning; confidence: 99%)