Word embeddings are real-valued word representations, trained on natural language corpora, that are able to capture lexical semantics. Models producing these representations have gained popularity in recent years, but the question of the most adequate evaluation method remains open. This paper presents an extensive overview of the field of word embeddings evaluation, highlighting the main problems and proposing a typology of approaches to evaluation, summarizing 16 intrinsic methods and 12 extrinsic methods. I describe both widely-used and experimental methods, systematize information about evaluation datasets and discuss some key challenges.
3. Absence of correlation between intrinsic and extrinsic methods. Performance scores of word embeddings, when measured with the two existing classes of evaluation approaches (intrinsic and extrinsic), do not correlate with each other, so it is unclear which class of methods is more adequate.
4. Lack of significance tests. Statistical significance tests are sometimes not performed in the key experiments with new distributional models and evaluation methods. As a result, some of the evaluation results reported in these papers are less reliable than is desirable.
5. The hubness problem. It is unclear how to deal with so-called hubs: word vectors, typically representing very frequent words, that are close to a disproportionately large number of other word vectors. Cosine distances between any two word vectors are therefore likely to be distorted by the hubs, and any evaluation in this case is biased.
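To make the hubness problem concrete, a common way to quantify it is the k-occurrence statistic: count how often each word vector appears among the k nearest neighbours of the other vectors, and flag vectors with disproportionately high counts as hubs. The sketch below is a minimal illustration of this idea using NumPy and random vectors; it is not taken from any particular evaluation paper, and the function name and parameters are illustrative assumptions.

```python
import numpy as np

def hubness_counts(vectors, k=10):
    """Count how often each vector appears among the k nearest
    neighbours (by cosine similarity) of the other vectors.
    Vectors with disproportionately high counts are 'hubs'."""
    # normalise rows so that the dot product equals cosine similarity
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)          # exclude self-similarity
    # indices of the k most similar vectors for every word
    knn = np.argsort(-sims, axis=1)[:, :k]
    # k-occurrence: how many neighbour lists each index appears in
    return np.bincount(knn.ravel(), minlength=len(vectors))

# toy example: 1000 random 50-dimensional "embeddings"
rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 50))
counts = hubness_counts(emb, k=10)
print("max k-occurrence:", counts.max(), " mean:", counts.mean())
```

The mean k-occurrence is always k by construction; a heavily skewed distribution (a few vectors with counts far above the mean) signals hubness and suggests that similarity-based evaluation scores may be biased by those vectors.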