Current methods for assessing the impact of authors and scientific media employ tools such as the H-Index, Co-Citation and PageRank. These tools are primarily based on citation counting, which treats all citations as equal. Methods of this type can create perverse incentives to publish controversial or incomplete papers, as mixed or negative reviews often generate larger citation counts and better indexes, regardless of whether the citations were critical or exerted minimal influence on the citing document. Passing citations that merely establish background and have no real impact on the citing paper are common in scientific literature, yet they carry equal weight in impact evaluations. Notable researchers have emphasized the need to correct this situation by developing estimation methods that consider the different roles of citations in citing papers. To accomplish this type of evaluation, a citation context analysis should be applied to determine the nature of the citations. We propose that citations be categorized along four dimensions (FUNCTION, POLARITY, ASPECTS and INFLUENCE), as these dimensions provide adequate information for building a qualitative method to measure the impact of a given publication on a citing paper. In this paper, we use the words influence and impact interchangeably. We present a method for obtaining this information using our proposed classification scheme and a manually annotated corpus, which is marked with meaningful keywords and labels to help identify the characteristics or properties that constitute what we call ASPECTS. We develop a classification scheme that takes into account the definitions of citation purpose shared by previous works.
Our contribution is to abstract purpose classes from several other schemes and divide a complex structure into more manageable parts, attaining a simple system that combines low-granularity dimensions but nevertheless produces a fine-grained classification. For annotators, the classification process is simple: in a first step, the coders distinguish only four primary classes, and in a second pass, they add the information contained in the ASPECTS keywords and labels to obtain the more specific functions. In this way, we obtain a high-granularity labeling that gives enough information about the citations to characterize and classify them, and we achieve this detailed coding with a straightforward process in which the level of human error can be minimized.
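As a rough illustration of how annotations along the four dimensions could feed an impact measure, the following minimal sketch stores one record per citation and computes a weighted count instead of a raw count. The label values (`"significant"`, `"perfunctory"`, `"positive"`, and so on), the numeric weights and the function `weighted_impact` are hypothetical choices made only for this example; they are not taken from the scheme described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CitationAnnotation:
    """One annotated citation; all label values are hypothetical placeholders."""
    citing_id: str
    cited_id: str
    function: str                 # primary class assigned in the first pass
    polarity: str                 # e.g. "positive", "neutral", "negative"
    influence: str                # e.g. "significant", "perfunctory"
    aspects: List[str] = field(default_factory=list)  # keywords added in the second pass


# Assumed weights: passing (perfunctory) citations count far less than influential ones.
INFLUENCE_WEIGHT: Dict[str, float] = {"significant": 1.0, "perfunctory": 0.1}
# Assumed signs: critical citations subtract from the score instead of adding to it.
POLARITY_SIGN: Dict[str, float] = {"positive": 1.0, "neutral": 1.0, "negative": -1.0}


def weighted_impact(annotations: List[CitationAnnotation], cited_id: str) -> float:
    """Sum influence weights, signed by polarity, over all citations of one paper."""
    return sum(
        INFLUENCE_WEIGHT[a.influence] * POLARITY_SIGN[a.polarity]
        for a in annotations
        if a.cited_id == cited_id
    )


corpus = [
    CitationAnnotation("A", "X", "use", "positive", "significant", ["method"]),
    CitationAnnotation("B", "X", "background", "neutral", "perfunctory"),
    CitationAnnotation("C", "X", "contrast", "negative", "significant", ["result"]),
    CitationAnnotation("D", "Y", "use", "positive", "significant", ["data"]),
]
raw_count_x = sum(1 for a in corpus if a.cited_id == "X")
score_x = weighted_impact(corpus, "X")
```

Under these assumed weights, paper X has a raw count of three citations but a weighted score close to 0.1, because the critical citation offsets the supportive one and the background citation contributes very little.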
In this paper we present a new method to improve the coverage of Passage Retrieval (PR) systems when they are used for Question Answering (QA) tasks. The ranking of passages returned by the PR system is rearranged to emphasize those passages that are more likely to contain the answer. The new ranking is based on finding the n-gram structures of the question that are present in the passage, and the weight of a passage increases when it contains longer n-gram structures of the question. The results we present show that applying this method notably improves the coverage of a classical PR system based on the Vector Space Model.
We would like to thank CONACyT for partially supporting this work under the grant 43990A-1 as well as R2D2 CICYT (TIC2003-07158-C04-03) and ICT EU-India (ALA/95/23/2003/077-054) research projects.
1 http://clef.iei.pi.cnr.it/
José Manuel Gómez Soriano et al., in V. Matoušek et al. (Eds.): TSD 2005, LNAI 3658, pp. 443-450, 2005. © Springer-Verlag Berlin Heidelberg 2005
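The n-gram re-ranking idea described in the passage retrieval abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' weighting formula: the whitespace tokenizer, the score (length of the longest question n-gram found in the passage, divided by the question length) and the names `longest_shared_ngram` and `rerank` are all hypothetical.

```python
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def longest_shared_ngram(question: List[str], passage: List[str]) -> int:
    """Length of the longest contiguous token sequence shared by question and passage."""
    best = 0
    prev = [0] * (len(passage) + 1)
    for q_tok in question:
        curr = [0] * (len(passage) + 1)
        for j, p_tok in enumerate(passage, start=1):
            if q_tok == p_tok:
                # Extend the n-gram that ended at the previous token pair.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def rerank(question: str, passages: List[str]) -> List[Tuple[float, str]]:
    """Reorder passages so that those sharing longer question n-grams come first."""
    q = tokenize(question)
    scored = []
    for passage in passages:
        n = longest_shared_ngram(q, tokenize(passage))
        scored.append((n / len(q) if q else 0.0, passage))
    # sorted() is stable, so ties keep the order given by the original PR system.
    return sorted(scored, key=lambda item: item[0], reverse=True)


passages = [
    "Darwin was a naturalist born in 1809",
    "Charles Darwin wrote the origin of species in 1859",
    "the origin of life is debated",
]
ranked = rerank("who wrote the origin of species", passages)
```

In this toy example the second passage moves to the top because it contains a five-token n-gram of the question, while the third passage shares only a three-token n-gram and the first shares none.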
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations, which are citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.