Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1101

Improving Unsupervised Word-by-Word Translation with Language Model and Denoising Autoencoder

Abstract: Unsupervised learning of cross-lingual word embedding offers elegant matching of words across languages, but has fundamental limitations in translating sentences. In this paper, we propose simple yet effective methods to improve word-by-word translation of cross-lingual embeddings, using only monolingual corpora but without any back-translation. We integrate a language model for context-aware search, and use a novel denoising autoencoder to handle reordering. Our system surpasses state-of-the-art unsupervised n…
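To make the approach in the abstract concrete, below is a minimal sketch of context-aware word-by-word translation: each source word is mapped to its nearest neighbours in a shared cross-lingual embedding space, and a target-side language model rescores the candidates in a small beam search. The names (src_emb, tgt_emb, lm_logprob, lm_weight) and the linear interpolation of cosine similarity with the LM score are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def top_k_candidates(word_vec, tgt_emb, tgt_vocab, k=10):
    """Return the k target words whose embeddings are closest (cosine) to word_vec."""
    sims = tgt_emb @ word_vec / (
        np.linalg.norm(tgt_emb, axis=1) * np.linalg.norm(word_vec) + 1e-9)
    idx = np.argsort(-sims)[:k]
    return [(tgt_vocab[i], float(sims[i])) for i in idx]

def translate(src_sentence, src_emb, tgt_emb, tgt_vocab, lm_logprob,
              beam_size=5, lm_weight=0.5, k=10):
    """Word-by-word translation with LM rescoring in a beam search (illustrative)."""
    beams = [([], 0.0)]  # (partial translation, score)
    for src_word in src_sentence:
        if src_word not in src_emb:          # copy unknown source words through
            beams = [(hyp + [src_word], score) for hyp, score in beams]
            continue
        candidates = top_k_candidates(src_emb[src_word], tgt_emb, tgt_vocab, k)
        new_beams = []
        for hyp, score in beams:
            for tgt_word, sim in candidates:
                # Interpolate embedding similarity with the LM's context score;
                # lm_logprob is an assumed user-supplied callable.
                lm = lm_logprob(hyp, tgt_word)
                new_beams.append((hyp + [tgt_word],
                                  score + (1 - lm_weight) * sim + lm_weight * lm))
        beams = sorted(new_beams, key=lambda b: -b[1])[:beam_size]
    return beams[0][0]
```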

Cited by 29 publications (24 citation statements) | References 13 publications
“…In addition, recent work has leveraged topological similarities between monolingual vector spaces to introduce fully unsupervised projection-based CLE approaches that do not require any bilingual supervision (Zhang et al., 2017; Conneau et al., 2018a; Artetxe et al., 2018b; Alvarez-Melis and Jaakkola, 2018). Being conceptually attractive, such weakly supervised and unsupervised CLEs have taken the field by storm recently (Conneau et al., 2018a; Dou et al., 2018; Doval et al., 2018; Hoshen and Wolf, 2018; Kim et al., 2018; Chen and Cardie, 2018; Mukherjee et al., 2018; Nakashole, 2018; Xu et al., 2018; Alaux et al., 2019).…”
Section: Introduction (mentioning)
confidence: 99%
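The projection-based CLE approaches named in this excerpt share one building block: a linear map between the two monolingual embedding spaces. Below is a minimal sketch of the orthogonal Procrustes solution to that mapping step, assuming a seed dictionary that unsupervised methods obtain without bilingual data (e.g. from adversarial initialization); variable names are illustrative.

```python
import numpy as np

def procrustes(X, Y):
    """Return the orthogonal matrix W minimising ||XW - Y||_F (rows are word vectors)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt          # W such that X @ W ~= Y

# Toy usage with random "embeddings" of dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
true_W, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # some orthogonal map
Y = X @ true_W
W = procrustes(X, Y)
print(np.allclose(X @ W, Y, atol=1e-6))   # True: the mapping is recovered
```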
“…In each experiment we report the quality of the intermediate Translationese as well as the scores for our full model. Kim et al. (2018) is the method closest to our work. We report the quality of Translationese as well as the scores for our full model.…”
Section: Methods (mentioning)
confidence: 98%
“…The closest unsupervised NMT work to ours is by Kim et al. (2018). Similar to us, they break translation into glossing and correction steps.…”
Section: Introduction (mentioning)
confidence: 90%
“…The input is the noisy version N(c) and the output is the cleaned sentence c, where c is a sentence sampled from the target monolingual corpus. Following Kim et al. (2018), we construct N(c) by designing three noises: insertion, deletion, and reordering. Readers can refer to Kim et al. (2018) for more technical explanations.…”
Section: Denoising Auto-encoder (mentioning)
confidence: 99%
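As a concrete illustration of the noise function N(c) described in this excerpt, here is a hypothetical sketch that corrupts a clean target sentence with insertion, deletion, and local reordering; the probabilities and window size are illustrative, not the values used by Kim et al. (2018).

```python
import random

def add_noise(sentence, vocab, p_ins=0.1, p_del=0.1, shuffle_window=3):
    """Return a noisy copy of `sentence` (a list of tokens)."""
    noisy = []
    for token in sentence:
        if random.random() < p_del:          # deletion: drop the token
            continue
        noisy.append(token)
        if random.random() < p_ins:          # insertion: add a random vocabulary word
            noisy.append(random.choice(vocab))
    # reordering: each token may move by at most `shuffle_window` positions
    keys = [i + random.uniform(0, shuffle_window + 1) for i in range(len(noisy))]
    noisy = [tok for _, tok in sorted(zip(keys, noisy), key=lambda x: x[0])]
    return noisy

# Example: corrupting a clean sentence drawn from the target monolingual corpus.
clean = "the cat sat on the mat".split()
print(add_noise(clean, vocab=["a", "house", "dog"]))
```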