Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018)
DOI: 10.18653/v1/n18-1190

Factors Influencing the Surprising Instability of Word Embeddings

Abstract: Despite the recent popularity of word embedding methods, there is only a small body of work exploring the limitations of these representations. In this paper, we consider one aspect of embedding spaces, namely their stability. We show that even relatively high frequency words (100-200 occurrences) are often unstable. We provide empirical evidence for how various factors contribute to the stability of word embeddings, and we analyze the effects of stability on downstream tasks.

Cited by 81 publications (105 citation statements). References 23 publications.

Citation statements, ordered by relevance:
“…The large drops in performance observed when using the CDE transformation are likely to relate to the instability of nearest neighborhoods and the importance of locality in embedding learning (Wendlandt et al., 2018), although the effects of the autoencoder component also bear further investigation. By effectively increasing the size of the neighborhood considered, CDE adds additional sources of semantic noise. [Footnote 6: Due to their large vocabulary size, we were unable to run Thresholded-NNE experiments with word2vec embeddings.]…”
Section: Analysis and Discussion
confidence: 99%
“…This transformation relates to the common use of nearest neighborhoods as a proxy for semantic information (Wendlandt et al., 2018; Pierrejean and Tanguy, 2018). We take the previously proposed approach of combining the output of f_NNE(v) for each v ∈ V to form a sparse adjacency matrix, which describes a directed nearest neighbor graph (Cuba Gyllensten and Sahlgren, 2015; Newman-Griffis and Fosler-Lussier, 2017), using three versions of f_NNE defined below.…”
Section: Nearest Neighbor Encoding (NNE)
confidence: 99%
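The quoted passage elides the three f_NNE definitions, but the underlying construction is clear: map each vocabulary item to a neighbor set and stack those sets into a sparse adjacency matrix. The sketch below (Python with NumPy/SciPy; the function name and the fixed-k cosine variant of f_NNE are illustrative assumptions, not the cited authors' exact definitions) builds such a directed nearest-neighbor graph.

import numpy as np
from scipy.sparse import csr_matrix

def nne_adjacency(emb, k=10):
    # emb: (V, d) array of row-normalized word vectors. f_NNE(v) is
    # taken here to be v's k nearest neighbors by cosine similarity;
    # the cited work also uses thresholded variants (assumption).
    sims = emb @ emb.T                      # cosine similarities (rows unit-norm)
    np.fill_diagonal(sims, -np.inf)         # a word is not its own neighbor
    nbrs = np.argpartition(-sims, k, axis=1)[:, :k]  # top-k columns per row
    rows = np.repeat(np.arange(emb.shape[0]), k)
    data = np.ones(rows.size)
    return csr_matrix((data, (rows, nbrs.ravel())), shape=(emb.shape[0],) * 2)

Row i of the resulting matrix then encodes the neighbor set of word i; note that the dense V-by-V similarity matrix makes this practical only for modest vocabularies.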
“…Prior research has noted instability of nearest neighborhoods in multiple embedding methods (Wendlandt et al., 2018). We therefore train 10 sets of embeddings from each of our subcorpora, each using the same hyperparameter settings but a different random seed.…”
Section: Identifying Concepts for Comparison
confidence: 99%
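A minimal sketch of that protocol, assuming gensim's Word2Vec (the toy corpus and the hyperparameter values are placeholders, not the cited paper's settings):

from gensim.models import Word2Vec

# Toy tokenized corpus standing in for one subcorpus (assumption: the
# real experiments use much larger text collections).
corpus = [
    ["word", "embeddings", "can", "be", "unstable"],
    ["nearest", "neighbors", "shift", "between", "runs"],
] * 100

# Ten models with identical hyperparameters but different random seeds.
# workers=1 keeps each run reproducible for a fixed seed; gensim's
# multithreaded training is otherwise non-deterministic.
models = [
    Word2Vec(corpus, vector_size=50, window=5, min_count=1, sg=1,
             workers=1, seed=s)
    for s in range(10)
]

Comparing the neighborhoods of the same word across the ten models then isolates seed-induced instability from hyperparameter effects.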
“…Stability can be quantified by calculating the overlap between the sets of words considered most similar to pre-selected anchor words. Reasonable metric choices are, e.g., the Jaccard coefficient (Jaccard, 1912) between these sets (Antoniak and Mimno, 2018; Chugh et al., 2018) or a percentage-based coefficient (Hellrich and Hahn, 2016a,b; Wendlandt et al., 2018; Pierrejean and Tanguy, 2018). We here use j@n, i.e., the Jaccard coefficient for the n most similar words.…”
Section: Measuring Stability
confidence: 99%
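In code, j@n is a set-overlap computation over the two top-n neighbor lists. A minimal sketch, reusing the gensim-style models from the sketch above (the function name and the n=10 default are assumptions):

def jaccard_at_n(model_a, model_b, anchor, n=10):
    # j@n = |A ∩ B| / |A ∪ B|, where A and B are the sets of the n
    # words most similar to `anchor` in the two embedding spaces.
    a = {w for w, _ in model_a.wv.most_similar(anchor, topn=n)}
    b = {w for w, _ in model_b.wv.most_similar(anchor, topn=n)}
    return len(a & b) / len(a | b)

# e.g., seed-to-seed stability of one anchor word:
# jaccard_at_n(models[0], models[1], "neighbors", n=10)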