Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP, 2019
DOI: 10.18653/v1/w19-2002

Characterizing the Impact of Geometric Properties of Word Embeddings on Task Performance

Abstract: Analysis of word embedding properties to inform their use in downstream NLP tasks has largely been studied by assessing nearest neighbors. However, geometric properties of the continuous feature space contribute directly to the use of embedding features in downstream models, and are largely unexplored. We consider four properties of word embedding geometry, namely: position relative to the origin, distribution of features in the vector space, global pairwise distances, and local pairwise distances. We define a…
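The four properties named in the abstract can each be reduced to simple summary statistics. The sketch below, in Python with NumPy and SciPy, shows one way to do so for an embedding matrix E of shape (vocab_size, dim); the particular statistics chosen (centroid norm, per-dimension variance, sampled pairwise distances, mean k-nearest-neighbor distance) are illustrative assumptions, not the paper's exact definitions.

import numpy as np
from scipy.spatial.distance import cdist

def geometry_stats(E, k=10, n_pairs=5000, n_query=200, seed=0):
    """Illustrative summaries of four geometric properties of an
    embedding matrix E with shape (vocab_size, dim)."""
    rng = np.random.default_rng(seed)

    # 1) Position relative to the origin: distance of the cloud's
    #    centroid from the origin, plus the average vector norm.
    centroid_norm = np.linalg.norm(E.mean(axis=0))
    avg_norm = np.linalg.norm(E, axis=1).mean()

    # 2) Distribution of features: per-dimension variance,
    #    summarized by its mean across dimensions.
    feature_variance = E.var(axis=0).mean()

    # 3) Global pairwise distances: mean Euclidean distance over a
    #    random sample of word pairs (avoids the full O(n^2) matrix).
    i, j = rng.integers(0, len(E), size=(2, n_pairs))
    global_dist = np.linalg.norm(E[i] - E[j], axis=1).mean()

    # 4) Local pairwise distances: mean distance to the k nearest
    #    neighbors for a random sample of query words.
    q = rng.integers(0, len(E), size=n_query)
    d = np.sort(cdist(E[q], E), axis=1)  # column 0 is the self-distance
    local_dist = d[:, 1:k + 1].mean()

    return {"centroid_norm": centroid_norm, "avg_norm": avg_norm,
            "feature_variance": feature_variance,
            "global_mean_dist": global_dist, "local_mean_dist": local_dist}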

Cited by 5 publications (5 citation statements) | References 38 publications
“…Because of their similar usage contexts in the training corpus, antonyms may often have similar embedding vectors. [12] Another problem is that word embeddings will faithfully model human bias if it is found in the training corpus. Brunet et al. [2] noted that "popular word embedding methods… acquire stereotypical human biases from the text data they are trained on."…”
Section: Limitations of Word Embeddings
Mentioning, confidence: 99%
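The antonym observation quoted above is easy to reproduce with any pretrained vectors. A minimal sketch using gensim follows; the embedding file path and the word pairs are illustrative assumptions.

from gensim.models import KeyedVectors

# Path is a placeholder: any word2vec-format embedding file works.
kv = KeyedVectors.load_word2vec_format("embeddings.txt", binary=False)

# Antonyms occur in near-identical contexts, so their cosine
# similarity is often as high as that of synonyms.
for a, b in [("hot", "cold"), ("increase", "decrease"), ("good", "bad")]:
    print(f"{a} / {b}: {kv.similarity(a, b):.3f}")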
“…Word embeddings overall have two types of usage [12]: intrinsic tasks make use of the characteristics of the word embedding vector space to solve problems directly. For example, pairwise comparison using a distance metric can be used to find words that are semantically or syntactically related.…”
Section: Introduction
Mentioning, confidence: 99%
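The pairwise comparison this statement describes, finding related words by ranking the vocabulary under a distance metric, can be sketched in a few lines. The inputs vocab (a list of words) and E (the matching embedding matrix) are assumed for illustration.

import numpy as np

def nearest_neighbors(word, vocab, E, k=5):
    """Intrinsic task: rank the vocabulary by cosine similarity to word."""
    En = E / np.linalg.norm(E, axis=1, keepdims=True)  # unit-normalize rows
    sims = En @ En[vocab.index(word)]                  # cosine similarities
    order = np.argsort(-sims)[1:k + 1]                 # skip the query itself
    return [(vocab[i], float(sims[i])) for i in order]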
“…However, these tasks have well-documented issues that limit their value as an evaluation metric [29], [30]. We therefore followed prior work [31], [32] by augmenting our intrinsic evaluation with "extrinsic" evaluations that measure the quality of word embeddings by plugging them into another machine learning model that learns to use them as features. We evaluated on the following tasks: − Relation extraction: SemEval-2010 shared task 8 [33], using a CNN model with word and distance embeddings [34] for nine-way relation classification.…”
Section: Evaluation Metrics, 1) Word2vec Evaluation Metrics
Mentioning, confidence: 99%
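As a stand-in for the CNN relation classifier cited in this statement, the sketch below shows the general shape of an extrinsic evaluation: the embeddings stay frozen, each example is represented by the mean of its word vectors, and a separate classifier learns to use those features. The bag-of-vectors representation and helper names are assumptions, much simpler than the evaluated model.

import numpy as np
from sklearn.linear_model import LogisticRegression

def extrinsic_probe(train_sents, train_labels, vocab_index, E):
    """Extrinsic evaluation: plug frozen embeddings into a downstream
    classifier. Each example is the mean of its in-vocabulary word
    vectors (assumes every example has at least one known word)."""
    def featurize(sents):
        return np.stack([
            E[[vocab_index[w] for w in s if w in vocab_index]].mean(axis=0)
            for s in sents
        ])
    clf = LogisticRegression(max_iter=1000)
    clf.fit(featurize(train_sents), train_labels)
    return clf  # score held-out data with clf.score(featurize(X), y)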
“…One possible way of connecting the disentanglement of latent representations to changes in task performance is by characterizing them in terms of geometrical properties. Such a form of analysis has been explored in a non-contextual and non-disentangled setting (Whitaker et al., 2019; Kim and Linzen, 2019), in terms of consistency and sensitivity to distance or linear displacements, but we hypothesise that characterizing such relations in terms of consistency of vector operations in a contextual latent space (and, in a dual manner, enforcing a geometrical consistency within this space) can support the construction of NLI models which are more interpretable, generalise better for complex inference, and have well-defined robustness properties.…”
Section: Introduction
Mentioning, confidence: 99%
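One concrete instance of the geometric consistency discussed in this statement is the stability of relation offsets: if a relation is encoded as a linear displacement, every word pair's offset vector should point in roughly the same direction. A minimal check under that assumption follows; vocab_index and E are assumed inputs.

import numpy as np

def offset_consistency(pairs, vocab_index, E):
    """Mean cosine similarity between each word pair's offset vector
    (b - a) and the average offset across all pairs. Values near 1
    suggest the relation behaves like a single linear displacement."""
    offsets = np.stack([E[vocab_index[b]] - E[vocab_index[a]]
                        for a, b in pairs])
    mean_off = offsets.mean(axis=0)
    cos = offsets @ mean_off / (np.linalg.norm(offsets, axis=1)
                                * np.linalg.norm(mean_off) + 1e-12)
    return float(cos.mean())

# e.g. offset_consistency([("man", "woman"), ("king", "queen")], vocab_index, E)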