2020
DOI: 10.1007/978-3-030-63031-7_15

A Mixed Learning Objective for Neural Machine Translation

Abstract: Evaluation discrepancy and the overcorrection phenomenon are two common problems in neural machine translation (NMT). NMT models are generally trained with a word-level learning objective but evaluated by sentence-level metrics. Moreover, the cross-entropy loss function discourages the model from generating synonymous predictions and overcorrects them to the ground-truth words. To address these two drawbacks, we adopt multi-task learning and propose a mixed learning objective (MLO) which combines the strength of word-level and …
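The abstract is cut off above, but the general shape of a mixed word-level/sentence-level objective can be sketched. The function below is a hypothetical illustration of one common multi-task formulation (token-level cross-entropy combined with a REINFORCE-style sentence-level term), not necessarily the paper's actual MLO; the names mixed_loss, alpha, sampled_rewards, and pad_id are illustrative assumptions.

```python
import torch.nn.functional as F

def mixed_loss(logits, targets, sampled_logprobs, sampled_rewards,
               alpha=0.5, pad_id=0):
    """Mix a word-level and a sentence-level training signal (sketch).

    logits:           (batch, seq_len, vocab) decoder scores for the reference
    targets:          (batch, seq_len) ground-truth token ids
    sampled_logprobs: (batch,) summed log-probability of a sampled hypothesis
    sampled_rewards:  (batch,) sentence-level metric (e.g. sentence BLEU) of
                      that hypothesis against the reference
    alpha:            mixing weight between the two terms (hypothetical)
    """
    # Word-level objective: standard token-wise cross-entropy against the
    # reference translation, ignoring padding positions.
    word_level = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=pad_id,
    )
    # Sentence-level objective: REINFORCE-style term that raises the
    # probability of hypotheses scoring well on the sentence-level metric.
    sentence_level = -(sampled_rewards.detach() * sampled_logprobs).mean()
    return alpha * word_level + (1.0 - alpha) * sentence_level
```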

Cited by 3 publications (7 citation statements)
References 17 publications
“…Despite the difference in their inputs and neural models, those approaches are similar in the sense that they are all based on a typical NMT model formulated as an encoder-decoder-attention architecture optimized with cross-entropy [11], [15], [16]. As a matter of fact, all the prior works on program repair are based on the NMT architecture with a cross-entropy loss [4]–[9].…”
Section: Neural Program Repair
Mentioning (confidence: 99%)
“…The cross-entropy loss (a.k.a. log loss) is a measure from information theory, building upon entropy and calculating the difference between two probability distributions. In sequence generation, the cross-entropy loss calculates the difference between the generated tokens and the human-written patch tokens in a strict pairwise matching manner [11], [16], [17]. In program repair patches, a low cross-entropy value means that the generated patch is syntactically close to the ground-truth patch at the token level.…”
Section: Neural Program Repair
Mentioning (confidence: 99%)
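The strict pairwise (token-by-token) matching described in the statement above can be made concrete with a small sketch; the tensors, shapes, and vocabulary size below are hypothetical, not taken from the cited works.

```python
import torch
import torch.nn.functional as F

# Hypothetical example: model scores for a generated patch vs. the tokens of
# the human-written (ground-truth) patch.
batch, seq_len, vocab = 2, 6, 100
logits = torch.randn(batch, seq_len, vocab)            # model scores per position
reference = torch.randint(0, vocab, (batch, seq_len))  # ground-truth token ids

# Strict pairwise matching: position i of the generated sequence is compared
# only against the reference token at position i.
loss = F.cross_entropy(logits.reshape(-1, vocab), reference.reshape(-1))

# A low value means the generated patch is syntactically close to the
# ground-truth patch at the token level.
print(loss.item())
```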