Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1168
Hierarchical Modeling of Global Context for Document-Level Neural Machine Translation

Abstract: Document-level machine translation (MT) remains challenging due to the difficulty in efficiently using document context for translation. In this paper, we propose a hierarchical model to learn the global context for document-level neural machine translation (NMT). This is done through a sentence encoder to capture intra-sentence dependencies and a document encoder to model document-level inter-sentence consistency and coherence. With this hierarchical architecture, we feedback the extracted global document conte…
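To make the hierarchical idea concrete, here is a minimal PyTorch sketch of the architecture the abstract describes: a sentence-level encoder for intra-sentence dependencies, a document-level encoder over pooled sentence vectors, and a fusion step that feeds the global context back to each word. All names here are hypothetical, and mean pooling stands in for whatever sentence aggregation the paper actually uses.

# Minimal sketch of a hierarchical document encoder (assumed simplification).
import torch
import torch.nn as nn

class HierarchicalContextEncoder(nn.Module):
    def __init__(self, vocab_size=32000, d_model=512, nhead=8, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Sentence-level encoder: intra-sentence dependencies.
        sent_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.sent_encoder = nn.TransformerEncoder(sent_layer, num_layers)
        # Document-level encoder: inter-sentence consistency and coherence.
        doc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.doc_encoder = nn.TransformerEncoder(doc_layer, num_layers)
        # Fuse each word state with the global document context.
        self.fuse = nn.Linear(2 * d_model, d_model)

    def forward(self, doc_tokens):
        # doc_tokens: (num_sents, sent_len) token ids for one document.
        word_states = self.sent_encoder(self.embed(doc_tokens))  # (S, L, D)
        sent_vecs = word_states.mean(dim=1)                      # pool each sentence
        global_ctx = self.doc_encoder(sent_vecs.unsqueeze(0))    # (1, S, D)
        global_ctx = global_ctx.squeeze(0).unsqueeze(1)          # (S, 1, D)
        global_ctx = global_ctx.expand_as(word_states)           # broadcast to words
        # Feed the extracted global document context back to every word state.
        return self.fuse(torch.cat([word_states, global_ctx], dim=-1))

# Usage: encode a 3-sentence document of 10 tokens each.
doc = torch.randint(0, 32000, (3, 10))
print(HierarchicalContextEncoder()(doc).shape)  # torch.Size([3, 10, 512])

The design point the sketch tries to capture is the feedback direction: document-level representations are computed from sentence summaries and then pushed back down to individual word states, rather than the decoder attending over raw context words directly.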

Cited by 48 publications (66 citation statements)
References 16 publications
“…Researchers propose various context-aware networks to utilize contextual information to improve the performance of DocNMT models on translation quality (Jean et al., 2017; Tu et al., 2018; Kuang et al., 2018) or discourse phenomena (Bawden et al., 2018; Voita et al., 2019a,b). However, most methods roughly leverage all context sentences within a fixed size that is tuned on development sets (Wang et al., 2017; Miculicich et al., 2018; Yang et al., 2019; Xu et al., 2020), or the full context of the entire document (Maruf and Haffari, 2018; Tan et al., 2019; Kang and Zong, 2020; Zheng et al., 2020). They ignore the individualized needs for context when translating different source sentences.…”
Section: Related Work
confidence: 99%
“…The majority of existing DocNMT models set the context size or scope to be fixed. They utilize all of the previous k context sentences (Miculicich et al., 2018; Voita et al., 2019b; Yang et al., 2019; Xu et al., 2020), or the full context of the entire document (Maruf and Haffari, 2018; Tan et al., 2019; Zheng et al., 2020). As a result, the inadequacy or redundancy of contextual information is almost inevitable.…”
Section: Introduction
confidence: 99%
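The fixed-window scheme this excerpt criticizes reduces to a one-line selection rule; a hedged sketch with hypothetical names:

def fixed_window_context(doc_sentences, i, k=3):
    # Hypothetical helper: for source sentence i, use the k preceding
    # sentences as context (fixed-window DocNMT); setting k to the document
    # length recovers the full-context variant the excerpt also mentions.
    return doc_sentences[max(0, i - k):i]

The quoted criticism is that k is a single hyperparameter tuned once on development sets, even though different source sentences need different amounts of context.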
“…Large-context encoder-decoder models: Large-context encoder-decoder models that can capture long-range linguistic contexts beyond sentence boundaries or utterance boundaries have received significant attention in E2E-ASR [7,8], machine translation [14,15], and some natural language generation tasks [16,17]. In recent studies, transformer-based large-context encoder-decoder models have been introduced in machine translation [18,19]. In addition, a fully transformer-based hierarchical architecture similar to our model has been proposed [20].…”
Section: Related Work
confidence: 99%
“…Voita et al [2019c] propose the CADec which demonstrates major gains over a context-agnostic baseline on their benchmarks without sacrificing BLEU. Tan et al [2019] propose a hierarchical model consisting of a sentence encoder to capture intra-sentence dependencies and a document encoder to model document-level information.…”
Section: Related Workmentioning
confidence: 99%
“…Despite the great success of these sequence-to-sequence models, they translate in a sentence-by-sentence manner, utilizing a large amount of sentence-level parallel data while totally ignoring extra-sentential context information and inter-sentence consistency. This issue has recently attracted wide attention to context-aware translation, and many context-aware translation approaches [Wang et al., 2017; Tiedemann and Scherrer, 2017; Bawden et al., 2018; Voita et al., 2018; Maruf and Haffari, 2018; Kuang et al., 2018; Kuang and Xiong, 2018; Läubli et al., 2018; Miculicich et al., 2018; Voita et al., 2019b,c; Xiong et al., 2019; Tan et al., 2019] have been proposed.…”
Section: Introduction
confidence: 99%