Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1292
A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning

Abstract: Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system through human post-edits. APE systems are usually trained by complementing human post-edited data with large, artificial data generated through back-translations, a time-consuming process often no easier than training an MT system from scratch. In this paper, we propose an alternative where we fine-tune pre-trained BERT models on both the encoder and decoder of an APE system, exploring several par…
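To make the abstract's training setup concrete, here is a minimal sketch (an assumption-laden illustration, not the paper's code) of fine-tuning a BERT-initialized encoder-decoder APE model on ⟨src, mt, pe⟩ triplets. `ape_model` and `ape_batches` are hypothetical placeholders: the model is assumed to consume the src/mt input and the post-edit (pe) target and to return a token-level cross-entropy loss.

```python
import torch

def fine_tune(ape_model: torch.nn.Module, ape_batches, epochs: int = 3):
    """Fine-tune all parameters of a BERT-initialized APE encoder-decoder."""
    optimizer = torch.optim.AdamW(ape_model.parameters(), lr=5e-5)
    ape_model.train()
    for _ in range(epochs):
        for batch in ape_batches:      # each batch holds src/mt inputs and pe targets
            loss = ape_model(**batch)  # assumed to return the training loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
```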

Cited by 18 publications (11 citation statements, all of type "mentioning"); references 17 publications. Citing publications span 2019–2024.
“…On the contrary, post-editing of system summaries through a set of basic operations such as insertion and deletion (Gu et al., 2019; Malmi et al., 2019; Dong et al., 2019b; Correia and Martins, 2019) may have intrinsic limitations by learning from single reference summaries to produce single outputs. In this paper, we provide a new dataset where each source text is associated with multiple admissible summaries to encourage diverse outputs.…”
Section: Vocabulary (mentioning)
confidence: 99%
“…On the target side, following Correia and Martins (2019), we use a single decoder where the context-attention block is initialized with the self-attention weights, and all the weights of the self-attention are shared with the respective self-attention weights in the encoder.…”
Section: BERT-based Encoder-Decoder (mentioning)
confidence: 99%
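The weight-sharing scheme described in this statement can be sketched in a few lines. The snippet below is a minimal PyTorch illustration, not the cited authors' code; module and function names are hypothetical. The decoder reuses the encoder's self-attention module outright, while the context (encoder-decoder) attention starts as a copy of those weights and is fine-tuned separately.

```python
import copy
import torch.nn as nn

def build_decoder_attention(encoder_self_attn: nn.MultiheadAttention,
                            hidden_size: int):
    """Illustrative construction of one decoder layer's attention blocks."""
    # Decoder self-attention shares parameters with the encoder's
    # self-attention: the same module object is reused, so both sides
    # update a single set of weights during fine-tuning.
    decoder_self_attn = encoder_self_attn

    # The context (encoder-decoder) attention is *initialized* from the
    # self-attention weights but kept as a separate copy, so it can
    # specialize during training.
    context_attn = copy.deepcopy(encoder_self_attn)

    # Standard position-wise feed-forward sublayer, included for completeness.
    feed_forward = nn.Sequential(
        nn.Linear(hidden_size, 4 * hidden_size),
        nn.GELU(),
        nn.Linear(4 * hidden_size, hidden_size),
    )
    return decoder_self_attn, context_attn, feed_forward
```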
“…Following Correia and Martins (2019), we adapt the BERT model to the APE task by integrating it into an encoder-decoder architecture. To this end, we use a single BERT encoder to obtain a joint representation of the src and mt sentences, and a BERT-based decoder whose multi-head context-attention block is initialized with the weights of the self-attention block.…”
Section: BERT-based Encoder-Decoder (mentioning)
confidence: 99%
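As a concrete illustration of the joint src/mt encoding, the sketch below packs the source sentence and the MT hypothesis into a single BERT input separated by [SEP] and distinguished by segment (token type) embeddings. It uses the HuggingFace transformers interface purely as a stand-in; the checkpoint name, example sentences, and framework choice are assumptions, not the setup of the cited works.

```python
from transformers import BertModel, BertTokenizerFast

# Hypothetical choice of checkpoint; any multilingual BERT would do here.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-multilingual-cased")
encoder = BertModel.from_pretrained("bert-base-multilingual-cased")

src = "the house is small"   # source sentence (illustrative)
mt = "das Haus ist winzig"   # machine-translated hypothesis (illustrative)

# Encodes "[CLS] src [SEP] mt [SEP]"; token_type_ids mark src tokens with 0
# and mt tokens with 1, giving a single joint representation of the pair.
inputs = tokenizer(src, mt, return_tensors="pt")
joint_repr = encoder(**inputs).last_hidden_state  # (1, seq_len, hidden_size)
```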
“…To mitigate the data scarcity, adding synthetic data to the genuine data to expand the training set [6]–[8] has emerged as a possible solution. In particular, eSCAPE [7], a synthetic APE dataset built from parallel corpora, has been used extensively in many studies [2]–[4], [9], [10]. eSCAPE uses parallel corpora composed of bitexts, i.e., pairs of a source (src) and a reference (ref), to build a set of synthetic APE triplets ⟨src, mt, ref⟩, in which mt is the MT output of src and ref serves as pe.…”
Section: Introduction (mentioning)
confidence: 99%
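To make the triplet construction concrete, here is a minimal sketch of how a parallel bitext yields synthetic ⟨src, mt, ref⟩ triplets in the spirit of eSCAPE, with the reference reused as the post-edit. The names are assumptions: `translate` stands for an arbitrary black-box MT system and this is not eSCAPE's actual pipeline.

```python
from typing import Callable, Iterable, Iterator, Tuple

def make_synthetic_triplets(
    bitext: Iterable[Tuple[str, str]],   # (src, ref) pairs from a parallel corpus
    translate: Callable[[str], str],     # any black-box MT system (placeholder)
) -> Iterator[Tuple[str, str, str]]:
    """Yield synthetic APE triplets <src, mt, ref>."""
    for src, ref in bitext:
        mt = translate(src)   # machine-translate the source to obtain mt
        yield src, mt, ref    # the reference plays the role of the post-edit (pe)
```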