Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d16-1096

Coverage Embedding Models for Neural Machine Translation

Abstract: In this paper, we enhance the attention-based neural machine translation (NMT) by adding explicit coverage embedding models to alleviate issues of repeating and dropping translations in NMT. For each source word, our model starts with a full coverage embedding vector to track the coverage status, and then keeps updating it with neural networks as the translation goes. Experiments on the large-scale Chinese-to-English task show that our enhanced model improves the translation quality significantly on various test sets.
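
The abstract describes the mechanism at a high level: every source word carries its own coverage embedding vector, initialized "full" and updated by a neural network at each decoding step. Below is a minimal PyTorch sketch of that idea, assuming a GRU-based update driven by the attention weight and the decoder state (the paper also discusses other update rules); the class and parameter names (CoverageEmbeddings, cov_dim, dec_dim) are illustrative, not the authors' code.

```python
# Hedged sketch of per-source-word coverage embeddings, not the paper's
# exact implementation. Assumption: each coverage vector is updated by a
# GRU cell from that word's attention weight and the decoder state.
import torch
import torch.nn as nn

class CoverageEmbeddings(nn.Module):
    def __init__(self, cov_dim, dec_dim):
        super().__init__()
        # Learned "full" initial coverage embedding, shared across positions
        # (all-ones init is an assumption for illustration).
        self.init_cov = nn.Parameter(torch.ones(cov_dim))
        # GRU cell that updates each coverage vector; its input is the
        # attention weight for that source word plus the decoder state.
        self.update = nn.GRUCell(1 + dec_dim, cov_dim)

    def initial(self, batch, src_len):
        # (batch, src_len, cov_dim): every source word starts "full".
        return self.init_cov.expand(batch, src_len, -1).contiguous()

    def step(self, coverage, attn_weights, dec_state):
        # coverage:     (batch, src_len, cov_dim) current coverage status
        # attn_weights: (batch, src_len) attention over source words this step
        # dec_state:    (batch, dec_dim) current decoder hidden state
        batch, src_len, cov_dim = coverage.shape
        dec = dec_state.unsqueeze(1).expand(batch, src_len, -1)
        inp = torch.cat([attn_weights.unsqueeze(-1), dec], dim=-1)
        # Update every source position independently with the shared cell.
        new_cov = self.update(inp.reshape(batch * src_len, -1),
                              coverage.reshape(batch * src_len, cov_dim))
        return new_cov.view(batch, src_len, cov_dim)
```

In this sketch the updated coverage vectors would be fed back into the attention scorer at the next step, so that already-covered source words attract less attention (alleviating repeats) and uncovered ones attract more (alleviating drops).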

Cited by 130 publications (118 citation statements)
References 13 publications
“…coverage (Tu et al., 2016; Mi et al., 2016). This is why we say upper-bound frequency estimation, not just (exact) frequency.…”
Section: Effective Usage
confidence: 99%
“…This issue has been discussed in the neural MT (NMT) literature as a part of a coverage problem (Tu et al., 2016; Mi et al., 2016). Such repeating generation behavior can become more severe in some NLG tasks than in MT.…”
Section: Introduction
confidence: 99%
“…Second, we strengthen the attentional network with a coverage vector accumulating the previous attentional information, similar to the work of Mi et al. (2016) and Tu et al. (2016b).…”
Section: KIT
confidence: 99%
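
The "coverage vector accumulating the previous attentional information" quoted above is a simpler, non-embedding variant of the same idea: the running sum of past attention weights enters the attention score as an extra term. A hedged sketch, assuming additive (Bahdanau-style) attention; CoverageAttention and its parameter names are assumptions, not code from the cited systems.

```python
# Sketch of an attention scorer augmented with an accumulated coverage
# vector; the additive-attention form and all names are assumptions.
import torch
import torch.nn as nn

class CoverageAttention(nn.Module):
    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.w_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.w_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        # Extra scoring term: accumulated attention mass per source word.
        self.w_cov = nn.Linear(1, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, enc_states, dec_state, coverage):
        # enc_states: (batch, src_len, enc_dim) encoder hidden states
        # dec_state:  (batch, dec_dim) current decoder state
        # coverage:   (batch, src_len) sum of past attention weights
        scores = self.v(torch.tanh(
            self.w_enc(enc_states)
            + self.w_dec(dec_state).unsqueeze(1)
            + self.w_cov(coverage.unsqueeze(-1))
        )).squeeze(-1)                        # (batch, src_len)
        attn = torch.softmax(scores, dim=-1)
        # Accumulate this step's attention into the coverage vector.
        return attn, coverage + attn
```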
“…For comparison, we used the baseline NMT system with soft coverage models (Mi et al., 2016; Tu et al., 2016b), which were used in first-pass decoding. Whereas these studies used gated recurrent units (GRUs) (Chung et al., 2014) for the NMT and coverage models, we used LSTM.…”
Section: Setup
confidence: 99%
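
The setup difference this quote describes (GRU versus LSTM for the NMT and coverage models) amounts to swapping the recurrent cell and carrying an extra memory state between steps. A minimal illustration, with dimensions and names chosen only for the example:

```python
# GRU vs. LSTM as the per-step update cell; sizes are illustrative.
import torch
import torch.nn as nn

cov_dim, inp_dim, batch = 64, 32, 8
gru_update = nn.GRUCell(inp_dim, cov_dim)    # GRU: a single hidden state
lstm_update = nn.LSTMCell(inp_dim, cov_dim)  # LSTM: hidden state + memory cell

x = torch.randn(batch, inp_dim)              # per-step update input
h = torch.zeros(batch, cov_dim)
c = torch.zeros(batch, cov_dim)

h_next = gru_update(x, h)                    # GRU carries only h across steps
h_next, c_next = lstm_update(x, (h, c))      # LSTM also carries c
```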
“…We introduced soft coverage models (Tu et al., 2016b; Mi et al., 2016) in Section 1. In addition to these published studies, there are several parallel related studies on arXiv (Wu et al., 2016; Li and Jurafsky, 2016; Tu et al., 2016a).…”
Section: Related Work
confidence: 99%