Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/P19-1494

Bilingual Lexicon Induction through Unsupervised Machine Translation

Abstract: A recent research line has obtained strong results on bilingual lexicon induction by aligning independently trained word embeddings in two languages and using the resulting cross-lingual embeddings to induce word translation pairs through nearest neighbor or related retrieval methods. In this paper, we propose an alternative approach to this problem that builds on the recent work on unsupervised machine translation. This way, instead of directly inducing a bilingual lexicon from cross-lingual embeddings, we use…
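For context, the retrieval baseline that the abstract contrasts against can be sketched in a few lines: given two embedding matrices already mapped into a shared cross-lingual space, each source word is paired with its nearest target-language neighbors under cosine similarity. A minimal sketch in Python, assuming pre-aligned embeddings; all names (src_emb, induce_lexicon, and so on) are illustrative, not from the paper:

    # Nearest-neighbor bilingual lexicon induction over cross-lingual
    # embeddings: the retrieval baseline family the abstract refers to.
    # Assumes src_emb / tgt_emb are (n_words, dim) NumPy arrays already
    # mapped into a shared space (hypothetical inputs, for illustration).
    import numpy as np

    def induce_lexicon(src_emb, tgt_emb, src_words, tgt_words, k=1):
        # Cosine similarity: L2-normalize rows, then take dot products.
        src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
        tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
        sims = src @ tgt.T                     # (n_src, n_tgt) similarities
        nn = np.argsort(-sims, axis=1)[:, :k]  # top-k target indices per row
        return {src_words[i]: [tgt_words[j] for j in nn[i]]
                for i in range(len(src_words))}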

Cited by 33 publications (30 citation statements). References 24 publications.
“…CLWE models, although similar in spirit to MONOTRANS, are only competitive on the easiest and smallest task (MLDoc), and perform poorly on the more challenging ones (XNLI and XQuAD). While previous work has questioned evaluation methods in this research area (Glavaš et al., 2019; Artetxe et al., 2019), our results provide evidence that existing methods are not competitive in challenging downstream tasks and that mapping between two fixed embedding spaces may be overly restrictive. For that reason, we think that designing better integration techniques of CLWE to downstream models is an important future direction.…”
Section: Transfer of Monolingual Representations (citation type: mentioning)
Confidence: 54%
“…Similarly, in addition to being a downstream application of the former, unsupervised machine translation can also be useful to develop other multilingual applications or learn better cross-lingual representations. This has previously been shown for supervised machine translation (McCann et al., 2017; Siddhant et al., 2019) and recently for bilingual lexicon induction (Artetxe et al., 2019a). In light of these connections, we call for a more holistic view of UCL, both from an experimental and theoretical perspective.…”
Section: Bridging the Gap Between Unsupervised Cross-lingual Learning Flavors (citation type: mentioning)
Confidence: 51%
“…Instead of performing non-linear regression, which is less effective in high-dimensional data, they sample words from the source language and map them using kernel mapping. Artetxe et al. [20] proposed an approach to building a bilingual lexicon by generating synthetic parallel data. The synthetic parallel data, produced with unsupervised machine translation, is then used as the resource from which the bilingual dictionary is extracted.…”
Section: Related Work 2.1 Bilingual Lexicon Extraction from Comparable Corpora (citation type: mentioning)
Confidence: 99%
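The synthetic-parallel-data route described in the last statement lends itself to a compact illustration: translate a monolingual corpus with an unsupervised MT system, then extract translation candidates from the resulting sentence pairs. A minimal count-based sketch in Python; the translate callable stands in for any unsupervised MT system, and the simple association score is an illustrative simplification rather than the paper's exact extraction procedure:

    # Induce a bilingual dictionary from synthetic parallel data: pair each
    # source sentence with its machine translation, accumulate word-level
    # co-occurrence counts, and keep the best-scoring target word per source
    # word. `translate` is a hypothetical stand-in for an unsupervised MT
    # system; the scoring is a simplification for illustration.
    from collections import Counter, defaultdict

    def induce_from_synthetic(src_sentences, translate):
        cooc = defaultdict(Counter)   # cooc[src_word][tgt_word] = pair count
        tgt_freq = Counter()
        for src_sent in src_sentences:
            tgt_words = translate(src_sent).split()  # synthetic target side
            tgt_freq.update(tgt_words)
            for s in set(src_sent.split()):
                cooc[s].update(set(tgt_words))
        # Score candidates by pair count normalized by target frequency, so
        # frequent function words do not dominate every dictionary entry.
        return {s: max(cands, key=lambda t: cands[t] / tgt_freq[t])
                for s, cands in cooc.items() if cands}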