Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2) 2019
DOI: 10.18653/v1/w19-5412

Transformer-based Automatic Post-Editing Model with Joint Encoder and Multi-source Attention of Decoder

Abstract: This paper describes POSTECH's submission to the WMT 2019 shared task on Automatic Post-Editing (APE). In this paper, we propose a new multi-source APE model by extending the Transformer. The main contributions of our study are that we 1) reconstruct the encoder to generate a joint representation of the translation (mt) and its source (src) context, in addition to the conventional src encoding, and 2) suggest two types of multi-source attention layers to compute attention between the two encoder outputs and the decoder state…
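The multi-source attention mentioned in the abstract can be pictured with a short sketch. The PyTorch layer below is a minimal, hedged approximation of a sequential variant, assuming one cross-attention block over the conventional src encoding followed by another over the joint (src, mt) representation; all names, dimensions, and the exact sub-layer order are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn as nn

class MultiSourceDecoderLayer(nn.Module):
    # Illustrative decoder layer attending to two encoder memories:
    # the conventional src encoding and a joint (src, mt) encoding.
    # Hyper-parameters and ordering are assumptions, not the paper's spec.
    def __init__(self, d_model=512, nhead=8, dim_ff=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout, batch_first=True)
        self.src_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout, batch_first=True)
        self.joint_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, dim_ff), nn.ReLU(), nn.Linear(dim_ff, d_model))
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(4))
        self.drop = nn.Dropout(dropout)

    def forward(self, tgt, src_mem, joint_mem, tgt_mask=None):
        # 1) masked self-attention over the decoder input
        a, _ = self.self_attn(tgt, tgt, tgt, attn_mask=tgt_mask)
        x = self.norms[0](tgt + self.drop(a))
        # 2) attention over the conventional src encoding
        a, _ = self.src_attn(x, src_mem, src_mem)
        x = self.norms[1](x + self.drop(a))
        # 3) attention over the joint (src, mt) encoding
        a, _ = self.joint_attn(x, joint_mem, joint_mem)
        x = self.norms[2](x + self.drop(a))
        # 4) position-wise feed-forward network
        return self.norms[3](x + self.drop(self.ff(x)))

With batch-first tensors of shape (batch, length, d_model), the layer consumes the decoder state together with the two encoder outputs; a parallel variant would instead compute both cross-attentions from the same input and combine them, e.g. by averaging or gating.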

Cited by 13 publications (9 citation statements) · References 11 publications
“…For this data construction process, they used the parallel corpora and the NMT model released for the WMT20 Quality Estimation shared task. As the APE model, they chose the sequential model proposed by Lee et al. (2019), applying some minor modifications to increase training efficiency. They submitted two ensemble models.…”
Section: Postech (mentioning)
confidence: 99%
“…A representative dataset that employs this method is eSCAPE [23]. Recent studies have also introduced high-performance APE models by applying this method [18,26]. As an alternative to using a translation model, a noising scheme has been adopted to generate the MT output when augmenting APE triplets from the parallel corpus [19].…”
Section: Two Research Directions of APE: A. Background (mentioning)
confidence: 99%
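The noising scheme mentioned in the quote above, which generates a synthetic mt from the reference side of a parallel corpus, can be sketched as follows; the drop/mask/swap operations and their probabilities are hypothetical stand-ins, not the noise model that [19] actually uses.

import random

def noise_tokens(tokens, p_drop=0.1, p_mask=0.1, p_swap=0.1, mask_token="<unk>"):
    # Corrupt a tokenised reference to imitate MT errors.
    # The operations and probabilities are illustrative assumptions.
    out = []
    for tok in tokens:
        r = random.random()
        if r < p_drop:
            continue                      # omission error
        elif r < p_drop + p_mask:
            out.append(mask_token)        # mistranslation placeholder
        else:
            out.append(tok)
    i = 0
    while i < len(out) - 1:               # local reordering errors
        if random.random() < p_swap:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out

def make_triplet(src_sentence, ref_sentence):
    # Build a synthetic APE triplet (src, mt, pe) from a parallel pair.
    pe = ref_sentence.split()
    return src_sentence.split(), noise_tokens(pe), pe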
“…Motivated by the recent research on using a second decoder to do post-editing [19,20,21,22,23], we use a similar structure to achieve the goal of proofreading. As shown in Figure 2, we use the basic setting of the Transformer decoder [18], and add an additional stacked multi-head attention layer after the original multi-head attention layer to handle the phone embedding of the source speech.…”
Section: Decoder Fusion (mentioning)
confidence: 99%
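The decoder-fusion idea quoted above, stacking one extra multi-head attention over phone embeddings on top of a standard Transformer decoder layer, could look roughly like this; placing the extra attention after the whole base layer rather than inside it is a simplification, and all names and sizes are hypothetical.

import torch
import torch.nn as nn

class PhoneFusionDecoderLayer(nn.Module):
    # Standard decoder layer plus one stacked attention over phone embeddings.
    # A rough sketch of the quoted design, not the cited paper's implementation.
    def __init__(self, d_model=512, nhead=8):
        super().__init__()
        self.base = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.phone_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, tgt, text_memory, phone_memory, tgt_mask=None):
        x = self.base(tgt, text_memory, tgt_mask=tgt_mask)     # self-attn + text cross-attn + FFN
        a, _ = self.phone_attn(x, phone_memory, phone_memory)  # extra attention over phones
        return self.norm(x + a)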