1987
DOI: 10.1002/(sici)1097-4571(198711)38:6<420::aid-asi3>3.0.co;2-s

Pictures of relevance: A geometric analysis of similarity measures

Abstract: We want computer systems that can help us assess the similarity or relevance of existing objects (e.g., documents, functions, commands, etc.) to a statement of our current needs (e.g., the query). Towards this end, a variety of similarity measures have been proposed. However, the relationship between a measure's formula and its performance is not always obvious. A geometric analysis is advanced and its utility demonstrated through its application to six conventional information retrieval similarity measures a…

citations
Cited by 201 publications
(90 citation statements)
references
References 13 publications
“…The advantage of the cosine being not a statistic but a similarity measure then disappears. Formally, these two measures are equivalent, with the exception that Pearson normalizes for the arithmetic mean while the cosine does not use this mean as a parameter (Jones & Furnas, 1987). The cosine normalizes for the geometrical mean.…”
Section: Introduction
confidence: 99%
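
The relationship described in this citation statement can be checked numerically. The following is a minimal sketch in Python, assuming the standard textbook definitions of the two measures: the Pearson correlation is computed as the cosine of the mean-centered vectors. The function names and sample vectors are hypothetical, chosen only for illustration, and are not taken from Jones & Furnas (1987).

```python
import math


def cosine(x, y):
    """Cosine similarity: inner product divided by the product of vector lengths."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, written as the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine(centered_x, centered_y)


# Hypothetical sample vectors (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]

# Adding a constant to every component changes the arithmetic mean of y.
shifted_y = [b + 10.0 for b in y]

print("pearson(x, y)         =", pearson(x, y))
print("pearson(x, shifted_y) =", pearson(x, shifted_y))
print("cosine(x, y)          =", cosine(x, y))
print("cosine(x, shifted_y)  =", cosine(x, shifted_y))
```

In this sketch the Pearson value is unaffected by the constant shift, because the mean is subtracted before the cosine is taken, while the plain cosine changes, since it does not use the mean as a parameter.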
“…Van Rijsbergen, 1979; Jones & Furnas, 1987; Ellis et al, 1993; Rorvig, 1999), and the choice of a specific measure may influence the outcome of the calculations. Van Rijsbergen, (1979), advised against the use of any measure that is not normalised by the length of the document vectors, something that was experimentally verified by Willett (1983).…”
Section: Introduction
confidence: 99%
“…These measures are useful to compute the neighborhood of a point and neighborhood-based measures but not for calculating similarity between a pair of data instances. In the area of information retrieval, Jones et al [12] and Noreault et. al [13] have studied several similarity measures.…”
Section: Related Work
confidence: 99%