Learning from Post-Editing: Online Model Adaptation for Statistical Machine Translation

Denkowski, Michael; Dyer, Chris; Lavie, Alon

doi:10.3115/v1/e14-1042

Cited by 43 publications

(42 citation statements)

References 17 publications

“…These systems, however, are not able to leverage the feedback of the post-editors in an online translation scenario. The capability to evolve by learning from human feedback has been addressed by several online translation systems but mainly focusing on the MT task (Hardt and Elming, 2010; Bertoldi et al., 2013; Mathur et al., 2013; Simard and Foster, 2013; Ortiz-Martínez and Casacuberta, 2014; Denkowski et al., 2014; Wuebker et al., 2015). From these several online MT systems, we discuss the two that have been used also for the APE task.…”

Section: Related Work (mentioning)

confidence: 99%

“…Compared to the suffix arrays used to implement MT dynamic models (Germann, 2014; Denkowski et al., 2014), in which the whole sentence pairs are stored, our technique needs to save more information (all the translation options) but: i) the amount of data in APE is much less than in MT so it can be easily managed by ad hoc solutions, and ii) it allows us to collect global information at translation option level that can result in useful additional features for the model. This last aspect is explored in the next section, in which the reliability of the translation options is measured by looking at the behavior of the post-editors.…”

Section: Dynamic Knowledge Base (mentioning)

confidence: 99%

Online Automatic Post-editing for MT in a Multi-Domain Translation Environment

Chatterjee, Gebremelak, Negri, et al. 2017

Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1

Automatic post-editing (APE) for machine translation (MT) aims to fix recurrent errors made by the MT decoder by learning from correction examples. In controlled evaluation scenarios, the representativeness of the training set with respect to the test data is a key factor to achieve good performance. Real-life scenarios, however, do not guarantee such favorable learning conditions. Ideally, to be integrated in a real professional translation workflow (e.g. to play a role in computer-assisted translation framework), APE tools should be flexible enough to cope with continuous streams of diverse data coming from different domains/genres. To cope with this problem, we propose an online APE framework that is: i) robust to data diversity (i.e. capable to learn and apply correction rules in the right contexts) and ii) able to evolve over time (by continuously extending and refining its knowledge). In a comparative evaluation, with English-German test data coming in random order from two different domains, we show the effectiveness of our approach, which outperforms a strong batch system and the state of the art in online APE.

“…Since retraining the SMT model after each interaction is too costly, online adaptation after each interaction has become the learning protocol of choice for CAT. Online learning has been applied in generative SMT, e.g., using incremental versions of the EM algorithm (Ortiz-Martínez et al., 2010; Hardt and Elming, 2010), or in discriminative SMT, e.g., using perceptron-type algorithms (Cesa-Bianchi et al., 2008; Martínez-Gómez et al., 2012; Wäschle et al., 2013; Denkowski et al., 2014). In a similar way to deploying human feedback, extrinsic loss functions have been used to provide learning signals for SMT.…”

Section: Related Work (mentioning)

confidence: 99%

Response-based Learning for Grounded Machine Translation

Riezler, Simianer, Haas 2014

Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We propose a novel learning approach for statistical machine translation (SMT) that allows to extract supervision signals for structured learning from an extrinsic response to a translation input. We show how to generate responses by grounding SMT in the task of executing a semantic parse of a translated query against a database. Experiments on the GEO-QUERY database show an improvement of about 6 points in F1-score for response-based learning over learning from references only on returning the correct answer from a semantic parse of a translated query. In general, our approach alleviates the dependency on human reference translations and solves the reachability problem in structured learning for SMT.

“…In the APE context, the input is a machine-translated segment (optionally with its corresponding source segment), which is processed by the online APE system to fix errors, and then verified by the post-editors. Several online translation systems have been proposed over the years (Hardt and Elming, 2010; Bertoldi et al., 2013; Mathur et al., 2013; Simard and Foster, 2013; Ortiz-Martínez and Casacuberta, 2014; Denkowski et al., 2014; Wuebker et al., 2015).…”

Section: Online Translation Systems (mentioning)

confidence: 99%

Online Automatic Post-Editing across Domains

Chatterjee, Gebremelak, Negri, et al. 2016

Proceedings of the Third Italian Conference on Computational Linguistics CLiC-it 2016

English. Recent advances in automatic post-editing (APE) have shown that it is possible to automatically correct systematic errors made by machine translation systems. However, most of the current APE techniques have only been tested in controlled batch environments, where training and test data are sampled from the same distribution and the training set is fully available. In this paper, we propose an online APE system based on an instance selection mechanism that is able to efficiently work with a stream of data points belonging to different domains. Our results on a mix of two datasets show that our system is able to: i) outperform state-of-the-art online APE solutions and ii) significantly improve the quality of rough MT output.

Italian (translated). Recent improvements in automatic post-editing systems have demonstrated their ability to correct recurrent errors made by machine translation. Often, however, such systems have been evaluated under controlled conditions where the training/test data are selected from the same distribution and the training set is fully available. This article proposes an online post-editing system, based on data selection techniques, capable of handling sequences of data belonging to different domains. The results on a set of mixed data show that the system is able to obtain better results than i) the state of the art and ii) the translation system.
