Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), Main Conference, 2006
DOI: 10.3115/1220835.1220868
Synchronous binarization for machine translation

Abstract: Systems based on synchronous grammars and tree transducers promise to improve the quality of statistical machine translation output, but are often very computationally intensive. The complexity is exponential in the size of individual grammar rules due to arbitrary re-orderings between the two languages, and rules extracted from parallel corpora can be quite large. We devise a linear-time algorithm for factoring syntactic re-orderings by binarizing synchronous rules when possible and show that the resulting ru…

Cited by 57 publications (76 citation statements)
References 11 publications
“…Zhang et al. (2006) introduced a synchronous binarization technique that improved decoding efficiency and accuracy by ensuring that rule binarization avoided gaps on both the source and target sides (for rules where this was possible). Their binarization was designed to share binarized pieces among rules, but their approach to distributing weight was the default (nondiffused) case found in this paper to be least efficient: the entire weight of the original rule is placed at the top binarized rule and all internal rules are assigned a probability of 1.0.…”
Section: Discussion
confidence: 99%
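The binarization scheme described in the excerpt above can be sketched compactly. The snippet below is a simplified, illustrative reconstruction (the names and data structures are ours, not Zhang et al.'s implementation): a rule's reordering is represented as a permutation giving each source-side nonterminal's target position, and a shift-reduce pass merges adjacent source nonterminals whenever their target spans are also adjacent, producing binary "virtual" rules.

```python
# Illustrative sketch of synchronous binarization, simplified from the
# idea in Zhang et al. (2006); this is an assumption-laden sketch, not
# their actual algorithm or code.  perm[i] is the target position of
# the i-th source-side nonterminal of a rule.

def synchronous_binarize(perm):
    """Shift-reduce binarization.

    Push each nonterminal's target position as a one-element span;
    whenever the top two spans on the stack are adjacent in the target,
    reduce them into a virtual binary rule.  Returns the list of
    reductions as (left_span, right_span, merged_span) triples, or
    None if the permutation admits no synchronous binarization.
    """
    stack = []       # target spans (lo, hi) of merged source pieces
    reductions = []
    for p in perm:
        stack.append((p, p))
        while len(stack) >= 2:
            (a_lo, a_hi), (b_lo, b_hi) = stack[-2], stack[-1]
            if b_lo == a_hi + 1 or a_lo == b_hi + 1:  # target-adjacent
                stack.pop(); stack.pop()
                merged = (min(a_lo, b_lo), max(a_hi, b_hi))
                reductions.append(((a_lo, a_hi), (b_lo, b_hi), merged))
                stack.append(merged)
            else:
                break
    return reductions if len(stack) == 1 else None
```

For example, the swap-of-pairs permutation `[1, 0, 3, 2]` binarizes into three virtual rules, while the classic non-binarizable permutation `[2, 4, 1, 3]` yields `None`. Under the non-diffused weighting the excerpt describes, the original rule's whole weight would sit on the final (top) reduction and every other virtual rule would carry probability 1.0.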
“…Increasing sharing reduces the amount of state that the parser must explore. Binarization has also been investigated in the context of parsing-based approaches to machine translation, where it has been shown that paying careful attention to the binarization scheme can produce much faster decoders (Zhang et al., 2006; Huang, 2007; DeNero et al., 2009).…”
Section: Introduction
confidence: 99%
“…For ease of presentation, and following synchronous-grammar based MT practice, we will henceforth restrict our focus to binary grammars (Zhang et al., 2006; Wang et al., 2007).…”
Section: Undirected Machine Translation
confidence: 99%
“…In Table 2, the dot column stands for artificial anchor points in the SL sentence, Lw and Rw for the previous and successive words of the current one respectively, and P, LHS, Lw, RHS and Rw constitute the syntactic reordering features of our model. Notice that, inspired by [1] and [11], we assume SL parse trees are binarized before being fed into the tree-to-string transformation algorithm. [1] suggests binary-branching ITG rules prune seemingly unlikely and arbitrary word permutations yet, at the same time, accommodate most meaningful structural reversals during translation.…”
Section: Tree-to-string Transformation Algorithm
confidence: 99%
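The last excerpt's claim that binary-branching ITG rules prune unlikely permutations while keeping most meaningful reversals can be checked directly: a permutation is derivable by binary ITG rules exactly when it is separable, i.e. it can be recursively split into two parts whose target images are adjacent contiguous blocks. The self-contained sketch below is our own illustration (not code from any of the cited works):

```python
from itertools import permutations

def itg_derivable(perm):
    """True iff perm can be produced by binary ITG rules: it splits at
    some point into a left and right part, each mapping onto one
    contiguous target block, each itself recursively derivable."""
    n = len(perm)
    if n <= 1:
        return True
    for k in range(1, n):
        left, right = perm[:k], perm[k:]
        # each side must occupy a single contiguous range of target
        # positions (straight or inverted order both allowed)
        if (max(left) - min(left) == k - 1 and
                max(right) - min(right) == n - k - 1 and
                itg_derivable(left) and itg_derivable(right)):
            return True
    return False

# Of the 24 permutations of 4 symbols, 22 are ITG-derivable; only the
# patterns 2413 and 3142 are excluded.
```

This makes the pruning concrete at small rule sizes: for four nonterminals, binary ITG rules exclude only the two "inside-out" permutations, while all 22 remaining reorderings stay reachable.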