2017
DOI: 10.1515/pralin-2017-0008

Learning Morphological Normalization for Translation from and into Morphologically Rich Languages

Abstract: When translating between a morphologically rich language (MRL) and English, word forms in the MRL often encode grammatical information that is irrelevant with respect to English, leading to data sparsity issues. This problem can be mitigated by removing irrelevant information from the MRL through normalization. Such preprocessing is usually performed in a deterministic fashion, using hand-crafted rules that yield suboptimal representations. We introduce here a simple way to automatically compute an appropria…

Cited by 4 publications (5 citation statements)
References 9 publications
“…We used different costs for predicting lemmas and each tag, which are summed into a final objective function. As recently seen in (Martinez et al., 2016; Burlot and Yvon, 2017), these objectives are individually easier when working with morphologically rich languages, and fully inflected words can be obtained by using morphological inflection models, which have been shown to be quite successful (Faruqui et al., 2016; Kann et al., 2017).…”
Section: Predicting Root and Tags Jointly
confidence: 94%
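The summed objective this excerpt describes, one cross-entropy term for the lemma plus one per morphological tag, can be sketched as below. The vocabulary sizes, tag slots, and random logits are illustrative assumptions, not the cited authors' actual setup.

```python
import numpy as np

def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under a softmax over `logits`."""
    shifted = logits - logits.max()  # subtract max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[target]

def joint_loss(lemma_logits, tag_logits_list, lemma_target, tag_targets):
    """Sum one cross-entropy for the lemma and one per morphological tag."""
    loss = cross_entropy(lemma_logits, lemma_target)
    for logits, target in zip(tag_logits_list, tag_targets):
        loss += cross_entropy(logits, target)
    return loss

# Toy example: a lemma vocabulary of 5 and two tag slots (case, number).
rng = np.random.default_rng(0)
lemma_logits = rng.normal(size=5)
tag_logits = [rng.normal(size=4), rng.normal(size=2)]  # 4 cases, 2 numbers
total = joint_loss(lemma_logits, tag_logits, lemma_target=2, tag_targets=[1, 0])
```

Each factor gets its own softmax over a much smaller inventory than the full inflected vocabulary, which is why the individual objectives are easier for morphologically rich languages.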
“…The specific setup we have used consisted of an architecture that enables training towards a dual objective: at each time-step in the output sentence, a normalized word and a PoS tag are produced. To obtain the first factor vocabulary, all target words have been normalized (Burlot and Yvon, 2017a), i.e. all grammatical information that is redundant wrt.…
Section: KIT
confidence: 99%
“…Normalization is usually performed using hand-crafted rules and requires expert knowledge for each language pair. In this paper, normalized words are obtained with an automatic data-driven method introduced in (Burlot and Yvon, 2017b).…”
Section: Normalizing Word Forms
confidence: 99%
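As a point of contrast for the hand-crafted approach this excerpt mentions, a rule-based normalizer might look like the toy sketch below. The rule (dropping case, which English does not mark on nouns) and the tag inventory are invented for illustration and do not reflect the data-driven method of Burlot and Yvon.

```python
def normalize(token, tags):
    """Return a normalized (surface, reduced-tag) pair.

    `tags` is a dict of morphological attributes for `token`; we drop
    `case`, which English does not mark on nouns, and keep the rest
    (e.g. `number`, which English does mark).
    """
    kept = {k: v for k, v in tags.items() if k != "case"}
    lemma = tags.get("lemma", token)
    return lemma, kept

# Example: a Czech accusative plural collapses to lemma + number,
# so all case-inflected variants share one normalized form.
word, reduced = normalize("kočky", {"lemma": "kočka", "case": "acc", "number": "pl"})
```

Every such rule must be written and validated per language pair, which is the expert-knowledge cost that a learned normalization aims to remove.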