Better Document-Level Machine Translation with Bayes’ Rule (2020)
DOI: 10.1162/tacl_a_00319

Abstract: We show that Bayes’ rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit because parallel documents are not always available. In our formulation, the posterior probability of a candidate translation is the product of the unconditional (prior) probability of the candidate output document and the “reverse translation probability” of translating the candidate output back into the source…
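
Written out, the decomposition the abstract describes takes the following form (a sketch of the math only; x denotes the source document and y a candidate output document, notation assumed here):

```latex
% Bayes' rule (noisy channel) decomposition of document translation.
% Requires amsmath. x = source document, y = candidate output document.
\begin{align*}
\hat{y} &= \operatorname*{arg\,max}_{y} \; p(y \mid x)
         = \operatorname*{arg\,max}_{y} \; \frac{p(y)\, p(x \mid y)}{p(x)} \\
        &= \operatorname*{arg\,max}_{y} \;
           \underbrace{p(y)}_{\text{document prior (LM)}} \,
           \underbrace{p(x \mid y)}_{\text{reverse translation model}}
\end{align*}
```

Because p(x) does not depend on y, the denominator can be dropped at decoding time, leaving exactly the two ingredients the abstract names: an unconditional document language model and a reverse translation model.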

Cited by 34 publications (27 citation statements). References 25 publications.
“…In this case, attention diffuses over the relatively long MLB macro plan, leading to inaccurate content selection. We could alleviate this problem by adopting a noisy channel decomposition (Yee et al., 2019; Yu et al., 2020), i.e., by learning two different distributions: a conditional model which provides the probability of translating a paragraph plan to text, and a language model which provides an unconditional estimate of the output (i.e., the whole game summary). However, we leave this to future work.…”
Section: Human-based Evaluation (mentioning)
Confidence: 99%
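
A minimal sketch of the noisy-channel reranking this excerpt describes: candidates are scored by an unconditional language model plus a conditional (“reverse”) model that maps the candidate back to the input. The interfaces log_p_lm and log_p_reverse and the weight lam are hypothetical placeholders, not APIs from the cited papers:

```python
from typing import Callable, List

def noisy_channel_rerank(
    source: str,
    candidates: List[str],
    log_p_reverse: Callable[[str, str], float],  # log p(source | candidate), hypothetical interface
    log_p_lm: Callable[[str], float],            # log p(candidate) under an unconditional LM
    lam: float = 1.0,                            # channel-model weight (assumed, typically tuned)
) -> str:
    """Return the candidate maximizing log p(y) + lam * log p(x | y),
    i.e., Bayes' rule with the constant p(x) dropped."""
    return max(candidates, key=lambda y: log_p_lm(y) + lam * log_p_reverse(source, y))
```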
“…Different from these approaches, G-Transformer uses a generic design for both source and context, translating the whole document in one beam search instead of sentence-by-sentence. Some methods use a two-pass strategy, generating sentence translations first and then integrating context information through a post-editing model (Voita et al., 2019a; Yu et al., 2020). In contrast, G-Transformer uses a single model, which reduces the complexity of both training and inference.…”
Section: Related Work (mentioning)
Confidence: 99%
“…Another line of document-level NMT work (Xiong et al., 2018; Voita et al., 2019b) proposed a two-pass document decoding model inspired by the deliberation network (Xia et al., 2017) in order to incorporate target-side document context. A parallel line of work (Garcia et al., 2017, 2019; Yu et al., 2019) introduced document-level approaches that do not require training a context-conditional NMT model, instead introducing a separate language model to enforce consistency in the outputs of the sentence-level NMT model. Garcia et al. (2019) used a simple n-gram-based semantic space language model (Hardmeier et al., 2012) to re-rank the outputs of the sentence-level NMT model inside the beam-search algorithm to enforce document-level consistency.…”
Section: Related Work (mentioning)
Confidence: 99%
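
A minimal sketch of the in-beam LM reranking described in the last sentence of this excerpt, assuming hypothetical interfaces (doc_lm_score, the weight beta) rather than the authors’ actual implementation: each beam hypothesis’s sentence-level NMT score is augmented with a document-level LM score before pruning.

```python
from typing import Callable, List, Tuple

def rerank_beam(
    hypotheses: List[Tuple[str, float]],       # (partial translation, NMT log-prob)
    doc_context: str,                          # previously translated document text
    doc_lm_score: Callable[[str, str], float], # log p(hypothesis | doc_context), hypothetical
    beta: float = 0.5,                         # LM interpolation weight (assumed)
    beam_size: int = 5,
) -> List[Tuple[str, float]]:
    """Rescore beam hypotheses with a document-level LM and keep the top beam_size,
    so candidates consistent with the surrounding document survive pruning."""
    rescored = [
        (hyp, nmt_logp + beta * doc_lm_score(hyp, doc_context))
        for hyp, nmt_logp in hypotheses
    ]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:beam_size]
```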