2017 ACM/IEEE Joint Conference on Digital Libraries (JCDL)
DOI: 10.1109/jcdl.2017.7991568
On the Various Semantics of Similarity in Word Embedding Models

Abstract: Finding similar words with the help of word embedding models has yielded meaningful results in many cases. However, the notion of similarity has remained ambiguous. In this paper, we examine when exactly similarity values in word embedding models are meaningful. To do so, we analyze the statistical distribution of similarity values systematically, in two series of experiments. The first one examines how the distribution of similarity values depends on the different embedding-model algorithms and parameters. The…
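The abstract describes analyzing the statistical distribution of similarity values in an embedding model. As a minimal, hedged sketch of one way such a distribution could be estimated (not the paper's actual experimental setup), the following samples random word pairs from a pre-trained model and summarizes their cosine similarities; the file name, gensim loading, and sampling scheme are illustrative assumptions.

```python
# Sketch: estimate the distribution of cosine-similarity values in a word
# embedding model by sampling random word pairs. Not the paper's exact setup.
import random
import numpy as np
from gensim.models import KeyedVectors

# Assumption: a pre-trained embedding in word2vec text format is available locally.
kv = KeyedVectors.load_word2vec_format("embeddings.txt", binary=False)

vocab = list(kv.key_to_index)  # gensim >= 4.0 vocabulary access
pairs = [(random.choice(vocab), random.choice(vocab)) for _ in range(10_000)]
sims = np.array([kv.similarity(w1, w2) for w1, w2 in pairs])

# Summary statistics of the similarity distribution.
print(f"mean={sims.mean():.3f}  std={sims.std():.3f}")
print("deciles:", np.percentile(sims, np.arange(0, 101, 10)).round(3))
```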

Cited by 23 publications (9 citation statements)
References 30 publications
“…In this paper, we opt for the former, therefore we set the values as follows: embedding vector = 300, min_word_count = 10, window_size = 5, number_workers = 5, dict_size = 100,000. These values have been established to be a reasonable baseline setting for various semantic similarity tasks (Elekes et al., 2017, 2018; Hill et al., 2015). For the cosine similarity transformation function parameters (a and c) used in our second algorithm, we set them using cross-validation as a = {20, 30, …, 60} and c = {0.9, 0.85, …, 0.7}.…”
Section: Methods
Mentioning confidence: 99%
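The hyperparameters quoted above correspond to a fairly standard word2vec-style configuration. As a hedged sketch (not the citing paper's code), they could be instantiated with gensim roughly as follows; the corpus path, the example query word, and the use of max_final_vocab to approximate dict_size are assumptions, and the cosine-transformation function with parameters a and c is not specified here, so it is not reproduced.

```python
# Hedged sketch of the baseline configuration quoted above, using gensim's Word2Vec.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

sentences = LineSentence("corpus.txt")  # one pre-tokenized sentence per line (assumed)

model = Word2Vec(
    sentences,
    vector_size=300,         # "embedding vector = 300"
    min_count=10,            # "min_word_count = 10"
    window=5,                # "window_size = 5"
    workers=5,               # "number_workers = 5"
    max_final_vocab=100_000, # closest gensim analogue to "dict_size = 100,000" (assumption)
)

# Example query word, assumed to be in the vocabulary.
print(model.wv.most_similar("library", topn=5))
```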
“…Note that the results can be transferred to the skip-gram learning algorithm of Word2Vec as well as to GloVe (Pennington et al., 2014; Levy and Goldberg, 2014). This is because very recent work has shown that the distribution of the cosine distances among the word vectors of all such models (after normalization) is highly similar (Elekes et al., 2017).…”
Section: Realizations of Word Embedding Models
Mentioning confidence: 99%
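The claim quoted above concerns the similarity of cosine-distance distributions across models after normalization. A small sketch of how two such distributions could be compared is given below; the model file names and the use of a Kolmogorov-Smirnov statistic are illustrative assumptions, not the cited methodology.

```python
# Sketch: compare the cosine-similarity distributions of two embedding models
# (e.g. skip-gram vs. GloVe) after length-normalizing the vectors.
import numpy as np
from gensim.models import KeyedVectors
from scipy.stats import ks_2samp

def similarity_sample(path, n_pairs=50_000, seed=0):
    kv = KeyedVectors.load_word2vec_format(path, binary=False)
    vecs = kv.get_normed_vectors()                  # unit-length vectors
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(vecs), n_pairs)
    j = rng.integers(0, len(vecs), n_pairs)
    return np.einsum("ij,ij->i", vecs[i], vecs[j])  # cosine = dot product of unit vectors

a = similarity_sample("skipgram.txt")   # assumed model files
b = similarity_sample("glove.txt")
print(ks_2samp(a, b))  # a small statistic indicates the distributions are close
```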
“…There is related work that studies the impact of these parameters (Baroni et al., 2014; Hill et al., 2014; Elekes et al., 2017). Consequently, for all parameters mentioned except for the window size, we can rely on the results from the literature.…”
Section: Building a Word Embedding Model
Mentioning confidence: 99%
“…The training process additionally requires setting the values for the hyperparameters in such algorithms, particularly the vector size and context window size. In the literature on word-embedding models (Elekes et al., 2017; Pennington et al., 2014) and applications (Banerjee et al., 2018; Kuzi et al., 2016; S. Li et al., 2018; Risch & Krestel, 2019), researchers normally experimented with different values for such parameters and determined the values according to specific contexts and needs.…”
Section: Term Vectorization
Mentioning confidence: 99%
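The statement above describes tuning vector size and context window size per task. The sketch below shows one common form such an experiment can take, sweeping a small grid and scoring each model on a word-similarity benchmark; the corpus path, grid values, and the WordSim-353 file bundled with gensim's test data are assumptions for illustration only.

```python
# Sketch of a small hyperparameter sweep over vector size and window size,
# scored on a word-similarity benchmark. Illustrative, not any cited paper's code.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence
from gensim.test.utils import datapath

sentences = LineSentence("corpus.txt")  # assumed pre-tokenized corpus, one sentence per line

results = {}
for vector_size in (100, 200, 300):
    for window in (2, 5, 10):
        model = Word2Vec(sentences, vector_size=vector_size, window=window,
                         min_count=10, workers=4, epochs=5)
        # evaluate_word_pairs returns (pearson, spearman, oov_ratio)
        pearson, spearman, oov = model.wv.evaluate_word_pairs(
            datapath("wordsim353.tsv"))
        results[(vector_size, window)] = spearman.correlation

best = max(results, key=results.get)
print("best (vector_size, window):", best, "spearman:", round(results[best], 3))
```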