2021
DOI: 10.1016/j.is.2020.101636

A large reproducible benchmark of ontology-based methods and word embeddings for word similarity

Cited by 14 publications (11 citation statements)
References 45 publications
“…Most cognitive functions, such as categorization, memory, decision-making, and reasoning, are based on human similarity and relatedness judgments between concepts. As a result, there is a large collection of human-labeled measure datasets to evaluate the degree of human-likeness from the standpoint of concept similarity and concept relatedness, particularly in the domains of natural language processing (Lastra-Diaz et al., 2021). To assess how well each type of representation reflects human judgments, we compute Spearman correlations between model-based similarity and human assessments, as is customary.…”
Section: The Gap Analysis (citation type: mentioning)
confidence: 99%
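
The evaluation step described in the statement above (a Spearman correlation between model-based similarity scores and human judgments) can be sketched as follows. This is a minimal illustration only, not the citing authors' code: the function names and the five rating pairs are made up for the example, and ties are handled with average ranks, which is one common convention.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: human similarity ratings and model scores
# for the same five word pairs (values invented for illustration).
human_ratings = [3.9, 3.5, 0.4, 1.2, 2.8]
model_scores = [0.82, 0.60, 0.10, 0.65, 0.40]

rho = spearman(human_ratings, model_scores)
print(round(rho, 3))
```

Because only the rank order matters, the model scores and the human ratings do not need to share a scale. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead of a hand-written version.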
“…All our experiments have been recorded into a Docker virtualization image that is provided as supplementary material together with our software [40] and a detailed reproducibility protocol [41] and dataset [42] to allow the easy replication of all our methods, experiments, and results. This work is based on our previous experience developing reproducible research in a series of publications in the area, such as the experimental surveys on word similarity introduced in [43][44][45][46], whose reproducibility protocols and datasets [47,48] are detailed and independently confirmed in two companion reproducible papers [38,49], and a reproducible benchmark on semantic measures libraries for the biomedical domain [39]. Finally, we refer the reader to our previous work [37] for a very detailed review of the literature on sentence similarity measures, which is omitted herein because of the lack of room and to avoid being redundant.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Our experiments will be based on our implementation and evaluation of all methods analyzed herein into a common and new software platform based on an extension of the Half-Edge Semantic Measures Library (HESML, http://hesml.lsi.uned.es) [40], called HESML for Semantic Textual Similarity (HESML-STS), as well as their subsequent recording with the Reprozip long-term reproducibility tool [41]. This work is based on our previous experience developing reproducible research in a series of publications in the area, such as the experimental surveys on word similarity introduced in [42–45], whose reproducibility protocols and datasets [46, 47] are detailed and independently confirmed in two reproducible papers [40, 48]. The experiments in this new software platform will evaluate most of the sentence similarity methods for the biomedical domain reported in the literature, as well as a set of unexplored methods which are based on adaptations from the general language domain.…”
Section: Introduction (citation type: mentioning)
confidence: 99%