2016
DOI: 10.1515/pralin-2016-0013

Efficient Word Alignment with Markov Chain Monte Carlo

Abstract: We present efmaral, a new system for efficient and accurate word alignment using a Bayesian model with Markov Chain Monte Carlo (MCMC) inference. Through careful selection of data structures and model architecture we are able to surpass the fast_align system, commonly used for performance-critical word alignment, both in computational efficiency and alignment accuracy. Our evaluation shows that a phrase-based statistical machine translation (SMT) system produces translations of higher quality when using word a…
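The abstract describes Bayesian word alignment with MCMC inference. As a rough illustration of the general technique only (this is not efmaral's actual implementation; the function name, hyperparameters, and toy corpus are invented for this sketch), here is a collapsed Gibbs sampler for an IBM-Model-1-style alignment model with a symmetric Dirichlet prior over translation distributions:

```python
import random
from collections import defaultdict

def gibbs_align(bitext, iters=50, alpha=0.01, seed=0):
    """Toy collapsed Gibbs sampler for word alignment.
    Each target token is linked to one source token; links are
    resampled one at a time from the posterior predictive of a
    Dirichlet-multinomial translation model (concentration alpha)."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: defaultdict(int))  # counts[s][t]: links s->t
    totals = defaultdict(int)                       # totals[s]: all links from s
    vocab_size = len({t for _, tgt in bitext for t in tgt})
    # Random initialization of the alignment state.
    alignments = []
    for src, tgt in bitext:
        a = [rng.randrange(len(src)) for _ in tgt]
        alignments.append(a)
        for j, t in enumerate(tgt):
            counts[src[a[j]]][t] += 1
            totals[src[a[j]]] += 1
    for _ in range(iters):
        for (src, tgt), a in zip(bitext, alignments):
            for j, t in enumerate(tgt):
                # Remove the current link from the sufficient statistics.
                counts[src[a[j]]][t] -= 1
                totals[src[a[j]]] -= 1
                # Resample proportional to the posterior predictive.
                weights = [(counts[s][t] + alpha) / (totals[s] + alpha * vocab_size)
                           for s in src]
                r = rng.random() * sum(weights)
                for i, w in enumerate(weights):
                    r -= w
                    if r <= 0:
                        a[j] = i
                        break
                counts[src[a[j]]][t] += 1
                totals[src[a[j]]] += 1
    return alignments

bitext = [(["das", "haus"], ["the", "house"]),
          (["das", "buch"], ["the", "book"]),
          (["ein", "haus"], ["a", "house"])]
result = gibbs_align(bitext)
```

Because only count increments and decrements are needed per resampled link, the cost of one sampling step is small, which is the general reason collapsed Gibbs samplers can be competitive with EM-based aligners in speed.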

Cited by 77 publications (47 citation statements)
References 18 publications (24 reference statements)
“…We believe that one reason for the lack of consistent benefits of MTL in the labelling literature is that the proposed models share all or part of the parameters for extracting hidden states, which leads to optimization conflicts when different tasks require different features. We believe it would be helpful to give the model the ability to learn a task-specific representation (Ammar et al., 2016; Östling and Tiedemann, 2016; Kiperwasser and Ballesteros, 2018) at the same time. This observation led us to design a new LSTM cell that allows, at almost no additional computational cost, efficient training of a single RNN-based model in which task-specific labelers clearly outperform their singly-tasked counterparts.…”
Section: Existing Shared Encoder Methods
confidence: 99%
“…First, we looked at the number of times the possessive pronoun son was translated by his or her in the training data. For this purpose, we used eflomal (Östling and Tiedemann, 2016) and used the alignment links to find all possible translations of the French son token. Results reported in Table 3 show that translations of son by his are three times more frequent than translations by her.…”
Section: Predicting Failure
confidence: 99%
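The citation above uses word-alignment links to count how often a source word is translated by particular target words. A minimal sketch of that counting step, assuming alignments have already been parsed into (source index, target index) pairs (the data format and the tiny example corpus below are invented for illustration, not eflomal's own output handling):

```python
from collections import Counter

def translation_counts(pairs, src_word):
    """Count target words linked to src_word via alignment links.
    pairs: list of (src_tokens, tgt_tokens, links), where links is a
    set of (i, j) source-to-target token index pairs."""
    counts = Counter()
    for src, tgt, links in pairs:
        for i, j in links:
            if src[i] == src_word:
                counts[tgt[j]] += 1
    return counts

pairs = [(["son", "livre"], ["his", "book"], {(0, 0), (1, 1)}),
         (["son", "chat"], ["her", "cat"], {(0, 0), (1, 1)}),
         (["son", "nom"], ["his", "name"], {(0, 0), (1, 1)})]
counts = translation_counts(pairs, "son")
# counts["his"] == 2, counts["her"] == 1
```

Aggregating over a full training corpus in this way yields the kind of his/her frequency ratio the citing paper reports.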
“…As this can be achieved by CombAlign, it makes WAScore an effective mechanism for measuring parallelism. In our experiment, CombAlign uses the following tools: (i) AWESoME (Dou and Neubig, 2021), (ii) eflomal (Östling and Tiedemann, 2016), and (iii) fast_align (Dyer et al., 2013). WAScore is calculated for each sentence using Equation (2):…”
Section: Sentence Scoring
confidence: 99%