Better synchronous binarization for machine translation

Xiao, Tong; Zhang, Dongdong; Zhu, Jun; Zhou, Ming

doi:10.3115/1699510.1699558

Cited by 4 publications

(1 citation statement)

References 13 publications


“…For SCFGs of arbitrary rank $l_N$, translation complexity in time for hypergraphs becomes $O(|G||s|^{l_N+1}|M|^{l_N+1})$; with FSAs the time complexity becomes $O(e^{|G||s|^{l_N+1}}|M|)$; and with PDAs the time complexity becomes $O(|G||s|^{l_N+1}|M|^3)$. For more complex SCFGs with rules of rank greater than 2, such as SAMT (Zollmann and Venugopal 2006) or GHKM (Galley et al 2004), this suggests that PDA representations may offer computational advantages in the worst case relative to hypergraph representations, although this must be balanced against other available strategies such as binarization (Zhang et al 2006; Xiao et al 2009) or scope pruning (Hopkins and Langmead 2010). Of course, practical translation systems introduce various pruning procedures to achieve much better decoding efficiency than the worst cases given here.…”

Section: Shortest Path (mentioning)

confidence: 99%

Pushdown Automata in Statistical Machine Translation

Allauzen, Byrne, Gispert, et al. 2014. Computational Linguistics.


This article describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite state automata representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first pass to address the results of PDA complexity analysis. We study in depth the experimental conditions and tradeoffs in which HiPDT can achieve state-of-the-art performance for large-scale SMT.
