2016
DOI: 10.1017/s1351324916000127

End-to-end statistical machine translation with zero or small parallel texts

Abstract: We use bilingual lexicon induction techniques, which learn translations from monolingual texts in two languages, to build an end-to-end statistical machine translation (SMT) system without the use of any bilingual sentence-aligned parallel corpora. We present detailed analysis of the accuracy of bilingual lexicon induction, and show how a discriminative model can be used to combine various signals of translation equivalence (like contextual similarity, temporal similarity, orthographic similarity and topic sim…

Citations: cited by 18 publications (14 citation statements)
References: 26 publications
“…Further, we experiment with orthogonal features, and with combining multiple source languages to get state of the art results on standard datasets. Irvine and Callison-Burch (2016) build a machine translation system for low-resource languages by inducing bilingual dictionaries from monolingual texts. Koehn and Knight (2001) experiment with varying knowledge levels on the task of translating German nouns in a small parallel German-English corpus.…”
Section: Others
Citation type: mentioning
confidence: 99%
“…As demonstrated by previous work [19,20], features based on the frequency of the phrases in the monolingual data may help to better estimate the similarity between two phrases. Indeed, we can expect that words or phrases that are translation of one another have a similar relative frequency in their respective language.…”
Section: Phrase Frequency and Phrase Length
Citation type: mentioning
confidence: 86%
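
The frequency intuition in the statement above can be illustrated with a small sketch. It is not the feature definition used by the citing paper or by Irvine and Callison-Burch, whose formulas are not reproduced on this page. The function names, the toy corpora, and the min/max ratio used as the score are all assumptions chosen for illustration.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each token to its share of all tokens in one monolingual corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def frequency_similarity(src_word, tgt_word, src_freq, tgt_freq):
    """Hypothetical score in [0, 1]: ratio of the smaller relative frequency
    to the larger one. 1.0 means both words are equally frequent in their
    own language; 0.0 means at least one word was never observed."""
    f_src = src_freq.get(src_word, 0.0)
    f_tgt = tgt_freq.get(tgt_word, 0.0)
    if f_src == 0.0 or f_tgt == 0.0:
        return 0.0
    return min(f_src, f_tgt) / max(f_src, f_tgt)


# Toy monolingual corpora (assumed data, not sentence-aligned).
src_freq = relative_frequencies("la maison est grande la maison".split())
tgt_freq = relative_frequencies("the house is big the house".split())

score_match = frequency_similarity("maison", "house", src_freq, tgt_freq)
score_mismatch = frequency_similarity("maison", "big", src_freq, tgt_freq)
score_unseen = frequency_similarity("maison", "cat", src_freq, tgt_freq)
```

In this toy data, the pair with matching relative frequencies scores higher than the mismatched pair, and an unseen word scores zero. A real system would treat such a score as one weak signal among several, since many unrelated words share similar frequencies.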
“…As demonstrated by previous work (Irvine and Callison-Burch, 2014; Irvine and Callison-Burch, 2016), features based on the frequency of the phrases in the monolingual data may help us to better score a phrase pair. We add as features the inversed frequency of the source and target phrases in the in-domain monolingual data, along with their relative difference given by the following formula:…”
Section: Other Features
Citation type: mentioning
confidence: 88%