Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2014
DOI: 10.3115/v1/p14-1066

Learning Continuous Phrase Representations for Translation Modeling

Abstract: This paper tackles the sparsity problem in estimating phrase translation probabilities by learning continuous phrase representations, whose distributed nature enables related phrases to share statistical strength through their representations. A pair of source and target phrases is projected into continuous-valued vector representations in a low-dimensional latent space, where their translation score is computed from the distance between the pair in this new space. The projection is performed by a neural network whose weights are …
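The scoring scheme the abstract describes can be sketched minimally: project each phrase (as a bag of words) into a shared low-dimensional space and score the pair by vector similarity. Everything below is an illustrative assumption — the toy vocabulary, the tanh projection, and the use of cosine similarity as the distance-based score are stand-ins, not the paper's exact architecture, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 50   # toy vocabulary size (assumption; a real model uses the full vocabulary)
LATENT = 8   # dimensionality of the low-dimensional latent space

# One projection matrix per language. The paper learns these weights;
# here they are random placeholders for illustration only.
W_src = rng.normal(scale=0.1, size=(LATENT, VOCAB))
W_tgt = rng.normal(scale=0.1, size=(LATENT, VOCAB))

def encode(phrase_word_ids, W):
    """Project a phrase (list of word ids) into the latent space
    via a bag-of-words vector and a non-linear projection."""
    bow = np.zeros(VOCAB)
    for w in phrase_word_ids:
        bow[w] += 1.0
    return np.tanh(W @ bow)

def translation_score(src_ids, tgt_ids):
    """Score a phrase pair by cosine similarity of their latent vectors
    (one plausible choice of distance-based score)."""
    s = encode(src_ids, W_src)
    t = encode(tgt_ids, W_tgt)
    return float(s @ t / (np.linalg.norm(s) * np.linalg.norm(t)))

score = translation_score([1, 4, 7], [2, 9])
```

Because the score depends only on the two latent vectors, rare phrases built from frequent words still receive informative scores — the sharing that addresses the sparsity problem.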

Cited by 102 publications (82 citation statements)
References 33 publications
“…Gao et al (2014a) successfully use an embedding model to refine the estimation of rare phrase-translation probabilities, which is traditionally affected by sparsity problems. Robustness to sparsity is a crucial property of our method, as it allows us to capture context information while avoiding unmanageable growth of model parameters.…”
Section: Related Work
confidence: 99%
“…Gao et al (2013) also use bag-of-words but learn BLEU-sensitive phrase embeddings. Approaches of this kind do not take word order into account and lose much information.…”
Section: Related Work
confidence: 99%
“…However, most of these methods focus on frequent words or an available bilingual phrase table (Zou et al, 2013;Zhang et al, 2014;Gao et al, 2014). Mikolov et al (2013a) learn a global linear projection from source to target using representation of frequent words on both sides.…”
Section: Related Work
confidence: 99%