Author cocitation analysis (ACA), a special type of cocitation analysis, was introduced by White and Griffith in 1981. This technique is used to analyze the intellectual structure of a given scientific field. In 1990, McCain published a technical overview that has been largely adopted as a standard. In that overview, McCain notes that Pearson's correlation coefficient (Pearson's r) is often used as a similarity measure in ACA and presents some advantages of its use. The present article criticizes the use of Pearson's r in ACA and sets forth two natural requirements that a similarity measure applied in ACA should satisfy. It is shown that Pearson's r does not satisfy these requirements. Real and hypothetical data are used to obtain counterexamples to both requirements. It is concluded that Pearson's r is probably not an optimal choice of similarity measure in ACA. Still, further empirical research is needed to show whether, and if so to what extent, the use of similarity measures in ACA that fulfill these requirements would lead to objectively better results in full-scale studies. Further, problems related to incomplete cocitation matrices are discussed.
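The abstract does not spell out the two requirements, so the following is only an illustrative sketch under an assumption: it checks one property a similarity measure might reasonably be asked to have, namely that the similarity between two authors does not change when authors cocited with neither of them are added to the data. The cocitation counts are hypothetical, and the cosine measure is used here only as a point of comparison, not as the article's recommendation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cosine(x, y):
    """Cosine of the angle between two profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))

# Hypothetical cocitation counts of authors A and B with four other authors.
a = [10, 8, 2, 1]
b = [1, 2, 8, 10]

# The same profiles after adding three authors never cocited with A or B.
a_ext = a + [0, 0, 0]
b_ext = b + [0, 0, 0]

r_before, r_after = pearson_r(a, b), pearson_r(a_ext, b_ext)
c_before, c_after = cosine(a, b), cosine(a_ext, b_ext)

print(f"Pearson's r: {r_before:.3f} -> {r_after:.3f}")
print(f"cosine:      {c_before:.3f} -> {c_after:.3f}")
```

In this toy example Pearson's r moves from about -0.99 to about -0.10 once the zero entries are appended, because the added zeros shift both means, while the cosine value stays the same.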
This paper builds on previous research concerned with the classification and specialty mapping of research fields. Two methods are tested to determine whether they yield significantly different maps of the research front of a science field. The first method was based on document co-citation analysis, where papers citing co-citation clusters were assumed to reflect the research front. The second method was bibliographic coupling, where citing papers were likewise assumed to reflect the research front. The application of these methods resulted in two different types of aggregations of papers: (1) groups of papers citing clusters of co-cited works and (2) clusters of bibliographically coupled papers. The comparison of the two methods with respect to mapping results was carried out by matching word profiles of groups of papers citing a particular co-citation cluster with word profiles of clusters of bibliographically coupled papers. Findings suggested that the research front was portrayed in two considerably different ways by the methods applied. It was concluded that the results of this study would support a further comparative study of these methods on a more detailed and qualitative basis.
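The two relations compared above can be sketched from the same citation data. This is a minimal illustration, not the study's procedure: the citing papers `P1` to `P4`, the cited works `A` to `E`, and the function names are all hypothetical. Co-citation counts how often two cited works appear together in a reference list, while bibliographic coupling counts how many references two citing papers share.

```python
from collections import Counter
from itertools import combinations

# Hypothetical citation data: citing paper -> set of cited works.
refs = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"B", "C", "D"},
    "P4": {"D", "E"},
}

def cocitation_counts(refs):
    """Number of citing papers in which each pair of cited works co-occurs."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

def coupling_strengths(refs):
    """Number of shared references for each pair of citing papers (zero pairs omitted)."""
    strengths = {}
    for p, q in combinations(sorted(refs), 2):
        shared = len(refs[p] & refs[q])
        if shared:
            strengths[(p, q)] = shared
    return strengths

cocit = cocitation_counts(refs)
coupling = coupling_strengths(refs)
print(cocit[("A", "B")])       # cited works A and B are cited together by P1 and P2
print(coupling[("P1", "P3")])  # citing papers P1 and P3 share references B and C
```

Clustering the cited works by co-citation and then collecting the papers that cite each cluster gives one aggregation of citing papers; clustering the citing papers directly by coupling strength gives the other.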
This paper deals with two document-document similarity approaches in the context of science mapping: bibliographic coupling and a text approach based on the number of common abstract stems. We used 43 articles, published in the journal Information Retrieval, as test articles. An information retrieval expert performed a classification of these articles. We used the cosine measure for normalization, and the complete linkage method was used for clustering the articles. A number of article pairs were ranked (1) according to descending normalized coupling strength, and (2) according to descending normalized frequency of common abstract stems. The degree of agreement between the two obtained rankings was low, as measured by Kendall's tau. The agreement between the two cluster solutions, one for each approach, was fairly low, according to the adjusted Rand index. However, there were examples of perfect agreement between the coupling solution and the stems solution. The classification generated by the expert contained larger groups compared to the coupling and stems solutions, and the agreement between the two solutions and the classification was not high. According to the adjusted Rand index, though, the stems solution was a better approximation of the classification than the coupling solution. With respect to cluster quality, the overall Silhouette value was slightly higher for the stems solution. Examples of homogeneous cluster structures, as well as negative Silhouette values, were found with regard to both solutions. The expert classification indicates that the field of information retrieval, as represented by one volume of articles published in Information Retrieval, is fairly heterogeneous regarding research themes, since the classification is associated with 15 themes.
The complete linkage method, in combination with the upper tail rule, gave rise to a fairly good approximation of the classification with respect to the number of identified groups, especially in the case of the stems approach.
AHLGREN & JARNEVING: Bibliographic coupling, common abstract stems and clustering. Scientometrics 76 (2008)
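The normalization and rank comparison described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the cosine normalization is taken to be the number of shared items divided by the square root of the product of the two list sizes, the tie-free tau-a variant of Kendall's tau is used, and all counts are hypothetical.

```python
from itertools import combinations
from math import sqrt

def cosine_normalized(shared, size_a, size_b):
    """Shared items divided by the geometric mean of the two list sizes."""
    return shared / sqrt(size_a * size_b)

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs; assumes no ties."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical data for four article pairs:
# (shared references, references in article 1, references in article 2)
coupling_raw = [(4, 16, 25), (3, 9, 36), (1, 25, 25), (6, 16, 36)]
# (common abstract stems, stems in abstract 1, stems in abstract 2)
stems_raw = [(5, 25, 100), (12, 36, 64), (9, 81, 25), (3, 36, 100)]

coupling = [cosine_normalized(*t) for t in coupling_raw]
stems = [cosine_normalized(*t) for t in stems_raw]

tau = kendall_tau_a(coupling, stems)
print(f"Kendall's tau-a between the two rankings: {tau:.3f}")
```

A value of tau near 1 would mean the two approaches rank the article pairs in nearly the same order; values near 0 or below indicate low agreement, as in this toy data set.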