2005
DOI: 10.1002/asi.20130

Similarity measures, author cocitation analysis, and information theory

Abstract: The use of Pearson's correlation coefficient in Author Cocitation Analysis was compared with Salton's cosine measure in a number of recent contributions. Unlike the Pearson correlation, the cosine is insensitive to the number of zeros. However, one has the option of applying a logarithmic transformation in correlation analysis. Information calculus is based on the logarithmic transformation and provides non-parametric statistics. Using this methodology one can cluster a document set in a precise way and…
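The abstract's claim that the cosine, unlike the Pearson correlation, is insensitive to the number of zeros can be illustrated with a minimal sketch. The vectors below are made-up toy counts, not data from the paper, and the helper names `cosine` and `pearson` are chosen here for illustration only.

```python
import math

def cosine(x, y):
    """Salton's cosine: dot product divided by the product of the vector norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson's r: covariance of the mean-centred vectors over their spread."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical cocitation counts for two authors (illustrative values only).
x = [3, 1, 4, 2]
y = [1, 4, 2, 3]

# The same two profiles after adding six cells in which both counts are zero.
padded_x = x + [0] * 6
padded_y = y + [0] * 6

print("cosine :", cosine(x, y), "->", cosine(padded_x, padded_y))
print("pearson:", pearson(x, y), "->", pearson(padded_x, padded_y))
```

With these toy values the cosine stays at about 0.70 after padding, because joint zeros add nothing to the dot product or to the norms. Pearson's r moves from about -0.80 to about 0.55, because the added zeros shift both means and so change every centred term.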

citations
Cited by 106 publications
(61 citation statements)
references
References 17 publications
“…The perfect rank-order correlation (r = 1.00; p < .01) between the cosine matrix derived from the asymmetrical citation matrix, and the Jaccard index based on this same Leydesdorff (2005) discussed the advantages of using information measures for the precise calculation of distances using the same co-occurrence data. Information theory also is based on probability calculus (cf.…”
Section: Results (mentioning)
confidence: 99%
“…This led to discussions in previous issues of this journal about the pros and cons of using the Pearson correlation or other measures (Ahlgren, Jarneving, & Rousseau, 2004; Bensman, 2004; Leydesdorff, 2005; White, 2003, 2004). Leydesdorff and Vaughan (2006) used the same dataset to show why one should use the (asymmetrical) citation instead of the (symmetrical) co-citation matrix as the basis for the normalization.…”
Section: Introduction (mentioning)
confidence: 99%
“…Following the debate over the use of various similarity measures (Ahlgren et al., 2003, 2004; White, 2003; Leydesdorff & Vaughan, 2006; Leydesdorff, 2008) and other techniques (Leydesdorff, 2005), we begin by testing Blondel et al.'s algorithm on the often-used case of 12 highly-cited authors working in the information retrieval field and 12 from the field of scientometrics or bibliometrics. Fig.…”
Section: Results (mentioning)
confidence: 99%
“…where This particular similarity measure is frequently used in text-classification applications, including, for example, the e-rater application, because it is known to be relatively insensitive to zero-frequency word classes (Leydesdorff, 2005). That is, the absence of a particular word class (e.g., bacteria|germ) does not indicate dissimilarity as strongly as the presence of a matching word class indicates similarity.…”
Section: Content Features (mentioning)
confidence: 99%