Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.408

Obtaining Better Static Word Embeddings Using Contextual Embedding Models

Abstract: The advent of contextual word embeddings (representations of words which incorporate semantic and syntactic information from their context) has led to tremendous improvements on a wide variety of NLP tasks. However, recent contextual models have prohibitively high computational cost in many use-cases and are often hard to interpret. In this work, we demonstrate that our proposed distillation method, which is a simple extension of CBOW-based training, allows us to significantly improve the computational efficiency of NLP…
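The "simple extension of CBOW-based training" described in the abstract distills a contextual teacher model into an ordinary lookup table of static word vectors. The sketch below is only an illustration of that general idea, not the authors' released X2Static code: the choice of bert-base-uncased as teacher, the mean-pooled context target, the cosine loss, and the use of the tokenizer's subword vocabulary in place of a word vocabulary are all simplifying assumptions.

```python
# Minimal sketch of a CBOW-style distillation of a contextual model into
# static word vectors. NOT the authors' X2Static implementation: teacher,
# pooling, loss, and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

teacher_name = "bert-base-uncased"            # any contextual encoder (assumption)
tokenizer = AutoTokenizer.from_pretrained(teacher_name)
teacher = AutoModel.from_pretrained(teacher_name).eval()

vocab_size = tokenizer.vocab_size             # subword vocab stands in for words
dim = teacher.config.hidden_size
static_emb = nn.Embedding(vocab_size, dim)    # the static vectors being learned
optimizer = torch.optim.Adam(static_emb.parameters(), lr=1e-3)

def train_step(sentence: str) -> float:
    """One CBOW-like step: the static vector of each token is pulled toward
    the teacher's contextual summary of that token's surrounding context."""
    enc = tokenizer(sentence, return_tensors="pt", truncation=True)
    ids = enc["input_ids"][0]                          # (seq_len,)
    with torch.no_grad():
        hidden = teacher(**enc).last_hidden_state[0]   # (seq_len, dim)

    loss, n = 0.0, 0
    for i in range(1, len(ids) - 1):                   # skip [CLS]/[SEP]
        # Teacher signal: mean contextual state of all *other* tokens,
        # a contextual stand-in for the CBOW context vector.
        context = torch.cat([hidden[1:i], hidden[i + 1:-1]], dim=0)
        if len(context) == 0:
            continue
        target = context.mean(dim=0)
        student = static_emb(ids[i])
        loss = loss + (1 - torch.cosine_similarity(student, target, dim=0))
        n += 1
    if n == 0:
        return 0.0
    loss = loss / n
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)

# Toy usage; in reality this would stream a large corpus.
for s in ["static embeddings are cheap to use",
          "contextual models are expensive at inference time"]:
    train_step(s)
```

In practice one would batch the updates over a large corpus and map subwords back to whole words; the point of the sketch is only the shape of the objective, in which each static vector is pulled toward a contextual summary of its context.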

Cited by 18 publications (19 citation statements) · References 47 publications
“…This difference allows us to apply the alignment to many more languages than most related work. For example, Wang et al. (2019) […] [Section 3: Static Embeddings from XLM-R] Gupta and Jaggi (2021) extracted English static embeddings from BERT and RoBERTa. They showed that their CBOW-like training scales better with more data and outperforms an aggregation approach to extracting static embeddings (Bommasani et al., 2020).…”
Section: Related Work (mentioning)
confidence: 99%
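The "aggregation approach" contrasted in the snippet above (Bommasani et al., 2020) involves no training: it builds a static vector for a word by pooling that word's contextual representations over its occurrences in a corpus. A rough sketch under the same assumptions as before (bert-base-uncased, subword tokens standing in for words); the helper name aggregate_static_vectors and the mean pooling are illustrative choices, not the original implementation.

```python
# Rough sketch of an aggregation-style extraction of static vectors:
# average a token's contextual representations over its occurrences.
from collections import defaultdict
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def aggregate_static_vectors(sentences):
    sums = defaultdict(lambda: 0.0)
    counts = defaultdict(int)
    for sent in sentences:
        enc = tokenizer(sent, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]          # (seq_len, dim)
        tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
        for tok, vec in zip(tokens, hidden):
            if tok in ("[CLS]", "[SEP]"):
                continue
            sums[tok] = sums[tok] + vec
            counts[tok] += 1
    # Static vector = mean contextual vector over all occurrences of the token.
    return {tok: sums[tok] / counts[tok] for tok in sums}

vectors = aggregate_static_vectors(
    ["the bank raised interest rates", "she sat on the river bank"]
)
print(vectors["bank"].shape)   # torch.Size([768])
```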
“…We choose 40 languages for static embeddings extraction (full list in Appendix A). As the multilingual contextual model, we use XLM-R. From preliminary experimentation, we determined how best to extract multilingual embeddings from the model: first, using X2Static (Gupta and Jaggi, 2021) worked better than aggregation (Bommasani et al., 2020) even with a small amount of data. One important difference with Gupta and Jaggi's work is that for our task the sentence-level variant of X2Static worked better than the paragraph-level version.…”
Section: Extraction and Alignment Process (mentioning)
confidence: 99%
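The sentence-level versus paragraph-level distinction in this snippet only concerns how much surrounding text is handed to the contextual teacher when a target word's training signal is computed. A toy illustration with made-up sentences; the pairing of target sentence and teacher context is the only thing that changes:

```python
# Toy illustration of the sentence- vs. paragraph-level context choice.
paragraph = [
    "XLM-R is a multilingual contextual model.",
    "Static vectors can be distilled from it.",
    "The context window used for distillation is a design choice.",
]

# Sentence-level variant (assumed): the teacher only sees the target word's own sentence.
sentence_level = [(sent, sent) for sent in paragraph]

# Paragraph-level variant (assumed): the teacher sees the whole paragraph every time.
paragraph_level = [(sent, " ".join(paragraph)) for sent in paragraph]

for target, context in sentence_level:
    print(f"target sentence: {target!r} | teacher context length: {len(context)}")
```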
“…One of the goals of the paper selection was to extract the most relevant pre-trained word embedding models from the many that have been studied. While recent research on contextual embeddings has proven immensely beneficial, static embeddings remain crucial in many situations (Gupta and Jaggi, 2021). Many NLP applications fundamentally depend on static word embeddings for metrics that are designed to be non-contextual (Shoemark et al., 2019), such as examining word vector spaces (Vulic et al., 2020) and bias studies (Gonen and Goldberg, 2019; Kaneko and Bollegala, 2019; Manzini et al., 2019).…”
Section: 3.1 (mentioning)
confidence: 99%