Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing Volume 1 - EMNLP '09 2009
DOI: 10.3115/1699510.1699558

Better synchronous binarization for machine translation

Abstract: Binarization of Synchronous Context-Free Grammars (SCFGs) is essential for achieving polynomial time complexity of decoding in machine translation systems based on SCFG parsing. In this paper, we first investigate the excess edge competition issue caused by a left-heavy binary SCFG derived with the method of Zhang et al. (2006). Then we propose a new binarization method to mitigate the problem by exploring other alternative equivalent binary SCFGs. We present an algorithm that iteratively improves the resulting bi…

Citations: cited by 4 publications (1 citation statement)
References: 13 publications
“…For SCFGs of arbitrary rank $l_N$, translation complexity in time for hypergraphs becomes $O(|G|\,|s|^{l_N+1}\,|M|^{l_N+1})$; with FSAs the time complexity becomes $O(e^{|G|\,|s|^{l_N+1}}\,|M|)$; and with PDAs the time complexity becomes $O(|G|\,|s|^{l_N+1}\,|M|^{3})$. For more complex SCFGs with rules of rank greater than 2, such as SAMT (Zollmann and Venugopal 2006) or GHKM (Galley et al 2004), this suggests that PDA representations may offer computational advantages in the worst case relative to hypergraph representations, although this must be balanced against other available strategies such as binarization (Zhang et al 2006; Xiao et al 2009) or scope pruning (Hopkins and Langmead 2010). Of course, practical translation systems introduce various pruning procedures to achieve much better decoding efficiency than the worst cases given here.…”
Section: Shortest Path
Citation type: mentioning
confidence: 99%