Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP '08)
DOI: 10.3115/1613715.1613758
Sampling alignment structure under a Bayesian translation model

Abstract: We describe the first tractable Gibbs sampling procedure for estimating phrase pair frequencies under a probabilistic model of phrase alignment. We propose and evaluate two nonparametric priors that successfully avoid the degenerate behavior noted in previous work, where overly large phrases memorize the training data. Phrase table weights learned under our model yield an increase in BLEU score over the word-alignment-based heuristic estimates used regularly in phrase-based translation systems.
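The model sketched in the abstract can be made concrete with a small illustration. The following is a minimal sketch, not the authors' implementation: it scores a single phrase pair under a Dirichlet Process prior, where the base measure penalizes long phrases so that overly large pairs receive little prior mass. The function names, the concentration parameter alpha, and the geometric length penalty are all assumptions standing in for the paper's actual priors.

# Minimal sketch (assumed, not from the paper): posterior predictive
# probability of a phrase pair (e, f) under a Dirichlet Process prior.
from collections import Counter

def base_prob(e, f, stop=0.5):
    # Hypothetical base measure P0: a geometric penalty on total phrase
    # length, so long phrases get little prior mass and cannot simply
    # memorize the training data. A real base measure would also score
    # the words themselves, e.g. with IBM Model 1.
    return stop ** (len(e.split()) + len(f.split()))

def dp_phrase_prob(pair, counts, total, alpha=1.0):
    # Reuse an observed pair proportionally to its count; back off to the
    # base measure with mass proportional to alpha.
    e, f = pair
    return (counts[pair] + alpha * base_prob(e, f)) / (total + alpha)

counts = Counter({("house", "maison"): 5, ("the house", "la maison"): 3})
print(dp_phrase_prob(("house", "maison"), counts, sum(counts.values())))

Under this predictive, frequently resampled pairs gain probability while unseen pairs fall back on the length-penalized base measure, which is one way a nonparametric prior can discourage the degenerate large-phrase solutions the abstract mentions.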

Cited by 43 publications (34 citation statements). References 17 publications.
“…[13] have successfully applied a similar Bayesian technique to grammar induction, and [14], [15] have developed tractable Bayesian methods for the more complex task of bilingual phrase pair extraction for SMT, which involves reordering. [16] tackle the overfitting problem in phrasal alignment with a leave-one-out strategy that, despite being a different paradigm, shares many of the characteristics of our approach.…”
Section: Motivation
confidence: 99%
“…More sophisticated methods of defining the base measure are possible; for example, [14], [15] use the IBM Model 1 likelihood of one phrase conditioned on the other in the base model to encourage the formation of bilingual pairs that follow the word alignments in the corpus.…”

Algorithm 1: The blocked Gibbs sampling algorithm (as quoted).
  Input: Random initial corpus segmentation
  Output: Unsupervised co-segmentation of the corpus according to the model
  foreach iter = 1 to NumIterations do
    foreach bilingual word-pair w ∈ randperm(W) do
      foreach co-segmentation γ_i of w do
        Compute the probability p(γ_i | h), where h is the set of data
        (excluding w) and its hidden co-segmentation
      end
      Sample a co-segmentation γ_i from the distribution p(γ_i | h)
      Update counts
    end
  end

Section: The Base Measure
confidence: 99%
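The pseudocode quoted above maps directly onto a short sampler loop. Below is a runnable sketch of that loop, assuming hypothetical helpers candidates_for (enumerating the co-segmentations γ_i of a word pair), conditional_prob (computing p(γ_i | h)), and remove/add (maintaining the count statistics); these stand in for the model-specific parts the citing paper defines.

import random

def blocked_gibbs(segmentation, counts, num_iterations,
                  candidates_for, conditional_prob, remove, add):
    # segmentation: dict mapping each bilingual word-pair w to its current
    # co-segmentation; counts: sufficient statistics over all of them.
    for _ in range(num_iterations):
        order = list(segmentation)
        random.shuffle(order)                    # w ∈ randperm(W)
        for w in order:
            remove(segmentation[w], counts)      # h excludes w itself
            gammas = candidates_for(w)
            weights = [conditional_prob(g, counts) for g in gammas]
            # sample gamma_i from p(gamma_i | h), then update counts
            segmentation[w] = random.choices(gammas, weights=weights, k=1)[0]
            add(segmentation[w], counts)
    return segmentation

Resampling a whole co-segmentation at once, rather than one boundary at a time, is what makes the sampler "blocked"; the trade-off is that candidates_for must enumerate every co-segmentation of w, which is feasible only when the word pairs are short.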
“…Most closely related is the work of DeNero et al. (2008), who derive a Gibbs sampler for phrase-based alignment, using it to infer phrase translation probabilities.…”
Section: Related Work
confidence: 99%
“…Bayesian inference plus the Dirichlet Process (DP) has been shown to effectively prevent MT models from overfitting the training data (DeNero et al., 2008; Blunsom et al., 2008). A similar approach can be applied here for SSMT by considering each TTS template as a cluster and using the DP to adjust the number of TTS templates according to the training data.…”
Section: Bayesian Inference with the Dirichlet Process Prior
confidence: 99%
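To see how a DP lets the template inventory grow with the data, consider the Chinese-restaurant-process predictive it induces. The numbers below are made up purely for illustration, and alpha is an assumed concentration parameter, not a value from the paper.

# Toy illustration (made-up counts): under a DP, an existing TTS template
# is reused proportionally to its count, while a brand-new template is
# created with mass proportional to alpha.
alpha = 1.0
counts = {"t1": 5, "t2": 2}
total = sum(counts.values())

def p_reuse(t):
    return counts[t] / (total + alpha)

def p_new():
    return alpha / (total + alpha)

print(p_reuse("t1"), p_reuse("t2"), p_new())   # 0.625 0.25 0.125

Because the new-template mass shrinks as total grows, the cluster count expands only when the data genuinely supports additional templates, which is the overfitting control the quotation describes.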
“…Non-parametric Bayesian methods have been successfully applied to directly learn phrase pairs from a bilingual corpus with little or no dependence on word alignments (Blunsom et al., 2008; DeNero et al., 2008). Because such approaches directly learn a generative model over phrase pairs, they are theoretically preferable to the standard heuristics for extracting the phrase pairs from the many-to-one word-level alignments produced by the IBM series models (Brown et al., 1993) or the Hidden Markov Model (HMM) (Vogel et al., 1996).…”
Section: Introduction
confidence: 99%