Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
DOI: 10.18653/v1/2021.findings-acl.260

Combining Static Word Embeddings and Contextual Representations for Bilingual Lexicon Induction

Abstract: Bilingual Lexicon Induction (BLI) aims to map words in one language to their translations in another, and is typically performed by learning linear projections to align monolingual word representation spaces. Two classes of word representations have been explored for BLI: static word embeddings and contextual representations, but there are no studies combining both. In this paper, we propose a simple yet effective mechanism to combine the static word embeddings and the contextual representations to utilize the adv…
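The abstract is truncated above, so the paper's exact combination mechanism is not spelled out here. As a purely illustrative, hypothetical sketch (the `weight` hyperparameter and length-normalized interpolation are assumptions, not taken from the paper), one simple way to fuse a static embedding with a mean-pooled contextual representation before learning the cross-lingual projection looks like this:

```python
import numpy as np

def combine_representations(static_vec, contextual_vec, weight=0.5):
    """Fuse a static word embedding with a (mean-pooled) contextual vector.

    Both inputs are assumed to share the same dimensionality and are
    length-normalized first so neither representation dominates; `weight`
    is a hypothetical mixing hyperparameter, not a value from the paper.
    """
    s = static_vec / (np.linalg.norm(static_vec) + 1e-8)
    c = contextual_vec / (np.linalg.norm(contextual_vec) + 1e-8)
    combined = weight * s + (1.0 - weight) * c
    return combined / (np.linalg.norm(combined) + 1e-8)
```

The fused vectors for each language would then replace the purely static ones when fitting the linear projection between the two spaces.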

Cited by 10 publications (7 citation statements)
References 28 publications
“…CLWE approaches involve some level of supervised alignment (Faruqui and Dyer, 2014; Zou et al., 2013), seed dictionaries (Artetxe et al., 2017; Gouws and Søgaard, 2015) or adversarial training (Lample et al., 2018; Artetxe et al., 2018; Zhang et al., 2017; Miceli Barone, 2016). Contextualized embeddings from language models have also been used in combination with static word embeddings to improve alignments of crosslingual word vectors (Aldarmaki and Diab, 2019; Zhang et al., 2021). Contrary to our findings for the token embeddings of LLMs, it was not clear that aligning word vectors is possible without some level of supervision, or to more than two languages at a time.…”
Section: Related Work (contrasting)
confidence: 75%
“…Most commonly, due to reduced bilingual supervision requirements, the CLWEs are induced by (i) pretraining monolingual word embeddings independently in two languages, and then (ii) mapping them by linear (Mikolov et al., 2013; Xing et al., 2015; Joulin et al., 2018; Artetxe et al., 2018) or non-linear transformations (Mohiuddin et al., 2020), minimising the distance between the original monolingual word embedding spaces. Optionally, such static CLWEs can be combined or enhanced with external word-level knowledge such as word translation knowledge embedded in multilingual language models (Zhang et al., 2021; Li et al., 2022).…”
Section: Methods (mentioning)
confidence: 99%
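The passage above describes the standard two-step induction: step (ii) is typically solved as an orthogonal Procrustes problem over a small seed dictionary. A minimal sketch, assuming (source, target) embedding matrices for the seed pairs (the function name and toy dimensions are illustrative, not from the cited work):

```python
import numpy as np

def procrustes_mapping(X, Y):
    """Return the orthogonal matrix W minimizing ||X @ W - Y||_F.

    X: (n, d) source-language embeddings of the n seed-dictionary words.
    Y: (n, d) target-language embeddings of the same n words.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy usage: 5 seed pairs in a 4-dimensional embedding space.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
Y = rng.normal(size=(5, 4))
W = procrustes_mapping(X, Y)
projected = X @ W  # source vectors mapped into the target space
```

Translations are then retrieved by nearest-neighbour (or CSLS) search of each projected source vector among the target-language embeddings.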
“…However, empirical evidence suggests that these approaches underperform static CLWEs for BLI (Vulić et al., 2020b): this is possibly because PLMs are primarily designed for longer sequence-level tasks and thus may naturally have inferior performance in word-level tasks when used off-the-shelf. Recent work started to combine static and contextualised word representations for BLI (Zhang et al., 2021). In fact, the previous SotA CLWEs for BLI, used as the baseline model in our work, are derived via a two-stage contrastive learning approach combining word representations of both types (Li et al., 2022).…”
Section: Related Work (mentioning)
confidence: 99%
“…When dealing with systems that have the potential to leverage dynamic word representations from various language models (e.g., CLBLI), we do not integrate these dynamic word representations, because the advantages that static WEs and dynamic word representations bring in enhancing BLI are orthogonal (Zhang et al. 2021).…”
Section: Training Setup and Hyperparameters (mentioning)
confidence: 99%