Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.215

LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space

Abstract: Most of the successful and predominant methods for Bilingual Lexicon Induction (BLI) are mapping-based, where a linear mapping function is learned under the assumption that the word embedding spaces of different languages exhibit similar geometric structures (i.e., are approximately isomorphic). However, several recent studies have criticized this simplified assumption, showing that it does not hold in general even for closely related languages. In this work, we propose a novel semi-supervised method to learn cross-…
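The mapping-based setup the abstract describes can be sketched in a few lines. This is a generic least-squares linear mapping (in the spirit of Mikolov et al., 2013), with toy synthetic embeddings, not the paper's LNMap method:

```python
import numpy as np

# Toy "embeddings" for 5 seed word pairs in 4-dimensional spaces.
# All data here is synthetic and purely illustrative.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))   # source-language embeddings
W_true = rng.standard_normal((4, 4))
Y = X @ W_true                    # target embeddings (exactly linear by construction)

# Closed-form least-squares solution for  min_W ||XW - Y||_F
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# A source word would then be "translated" by mapping it with W and
# retrieving its nearest neighbour in the target space.
```

Because the toy target space is an exact linear image of the source space (perfect isomorphism), the recovered map reproduces Y exactly; the paper's point is that real embedding spaces violate this assumption.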

Cited by 25 publications (36 citation statements)
References 26 publications
“…Most work on BLI learns a mapping between two static word embedding spaces, which are pretrained on large monolingual corpora (Ruder et al., 2019). Both linear mapping (Mikolov et al., 2013; Xing et al., 2015; Artetxe et al., 2016; Smith et al., 2017) and non-linear mapping (Mohiuddin et al., 2020) methods have been studied to align the two spaces. Recently, beyond static word embeddings, contextual representations have been used for BLI due to the significant progress on cross-lingual applications (Aldarmaki and Diab, 2019; Schuster et al., 2019).…”
Section: Introduction
confidence: 99%
“…We show how a non-linear mapping (an invertible neural network) can be trained with a cyclic consistency loss, showing that the common isomorphic assumption is not strictly necessary (Søgaard et al., 2018). The trained network has fewer parameters than that of Mohiuddin et al. (2020) while providing equivalent or improved performance on the low-resource word translation task.…”
Section: Discussion
confidence: 99%
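The cyclic consistency idea in the statement above can be illustrated with a minimal numpy sketch. The loss form (reconstruct each space after a round trip through the other) is one common formulation, and the exactly invertible linear pair is a stand-in for the cited invertible network, not the actual architecture:

```python
import numpy as np

def cycle_consistency_loss(X, Y, f, g):
    """L_cyc = mean||g(f(X)) - X||^2 + mean||f(g(Y)) - Y||^2.

    f maps source -> target, g maps target -> source. A common
    formulation; the exact loss in the cited work may differ.
    """
    return np.mean((g(f(X)) - X) ** 2) + np.mean((f(g(Y)) - Y) ** 2)

# With an exactly invertible linear pair f = W, g = W^{-1},
# both round trips are identities and the loss vanishes.
rng = np.random.default_rng(1)
W = rng.standard_normal((4, 4)) + 4 * np.eye(4)   # well-conditioned, invertible
W_inv = np.linalg.inv(W)
X = rng.standard_normal((6, 4))                   # toy source embeddings
Y = rng.standard_normal((6, 4))                   # toy target embeddings
loss = cycle_consistency_loss(X, Y, lambda a: a @ W, lambda b: b @ W_inv)
```

An invertible network makes this loss zero by construction, which is one way such methods sidestep the need for the two spaces to be isomorphic under a single orthogonal map.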
“…For baseline comparisons, we retrain VECMAP (Artetxe et al., 2016, 2018), GeoMM (Jawanpuria et al., 2019) and RCSLS (Joulin et al., 2018). When possible, we compare with BLISS(R) (Patra et al., 2019), Joint Align (Wang et al., 2019), Cross-lingual Anchoring (Ormazabal et al., 2020) and LNMAP (Mohiuddin et al., 2020) using results previously reported for high-resource languages. We train BDMA with a combination of cosine (C) and RCSLS (R) losses, and separate baseline methods for each language and translation direction pair.…”
Section: Experiments and Analysis
confidence: 99%
“…Some more complex alignment methods like RCSLS (Joulin et al., 2018) optimise for dictionary translation performance, which assumes isomorphism, but simpler methods like the orthogonal Procrustes solution are more effective for downstream tasks like natural language inference (Glavaš et al., 2019). Mohiuddin et al. (2020) propose a solution to the isomorphism problem by learning a new shared embedding space with an auto-encoding neural model instead of trying to fit the embeddings of one language into the space of another language.…”
Section: Aligning Word Embeddings
confidence: 99%
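The orthogonal Procrustes solution mentioned in the statement above has a well-known closed form via SVD: for min over orthogonal W of ||XW − Y||_F, take W = UVᵀ where USVᵀ = svd(XᵀY). The sketch below uses synthetic embeddings and is not tied to any particular BLI toolkit:

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    """Return the orthogonal W minimising ||X W - Y||_F.

    Closed form: W = U V^T, where U S V^T is the SVD of X^T Y.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Synthetic check: if Y really is an orthogonal image of X,
# Procrustes recovers the rotation exactly.
rng = np.random.default_rng(2)
X = rng.standard_normal((10, 5))                 # toy source embeddings
Q, _ = np.linalg.qr(rng.standard_normal((5, 5))) # a random orthogonal "true" map
Y = X @ Q
W = orthogonal_procrustes(X, Y)
```

The orthogonality constraint is exactly where the isomorphism assumption enters: a single rotation can align the two spaces only if their geometries already match, which is the premise LNMap relaxes.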