The objective of the paper is to amalgamate theories of text retrieval from various research traditions into a cognitive theory for information retrieval interaction. Set in a cognitive framework, the paper outlines the concept of polyrepresentation applied to both the user's cognitive space and the information space of IR systems. The concept seeks to represent the current user's information need, problem state, and domain work task or interest in a structure of causality. Further, it implies that we should apply different methods of representation and a variety of IR techniques of different cognitive and functional origin simultaneously to each semantic full‐text entity in the information space. The cognitive differences imply that by applying cognitive overlaps of information objects, originating from different interpretations of such objects through time and by type, the degree of uncertainty inherent in IR is decreased. Polyrepresentation and the use of cognitive overlaps are associated with, but not identical to, data fusion in IR. By explicitly incorporating all the cognitive structures participating in the interactive communication processes during IR, the cognitive theory provides a comprehensive view of these processes. It encompasses the ad hoc theories of text retrieval and IR techniques hitherto developed in mainstream retrieval research. It has elements in common with van Rijsbergen and Lalmas' logical uncertainty theory and may be regarded as compatible with that conception of IR. Epistemologically speaking, the theory views IR interaction as processes of cognition, potentially occurring in all the information processing components of IR, that may be applied, in particular, to the user in a situational context. The theory draws upon basic empirical results from information seeking investigations in the operational online environment, and from mainstream IR research on partial matching techniques and relevance feedback. By viewing users, source systems, intermediary mechanisms and information in a global context, the cognitive perspective attempts a comprehensive understanding of essential IR phenomena and concepts, such as the nature of information needs, cognitive inconsistency and retrieval overlaps, logical uncertainty, the concept of ‘document’, relevance measures and experimental settings. An inescapable consequence of this approach is to rely more on sociological and psychological investigative methods when evaluating systems and to view relevance in IR as situational, relative, partial, differentiated and non‐linear. The lack of consistency among authors, indexers, evaluators or users is of an identical cognitive nature. It is unavoidable, and indeed favourable to IR. In particular, for full‐text retrieval, alternative semantic entities, including Salton et al.'s ‘passage retrieval’, are proposed to replace the traditional document record as the basic retrieval entity. These empirically observed phenomena of inconsistency and of semantic entities and values associated with data interpretation support strongly a cognitive approach to IR and the logical use of polyrepresentation, cognitive overlaps, and both data fusion and data diffusion.
The paper describes the ideas and assumptions underlying the development of a new method for the evaluation and testing of interactive information retrieval (IR) systems, and reports on the initial tests of the proposed method. The method is designed to collect different types of empirical data, i.e. cognitive data as well as traditional systems performance data. The method is based on the novel concept of a 'simulated work task situation' or scenario and the involvement of real end users. The method is also based on a mixture of simulated and real information needs, and involves a group of test persons as well as assessments made by individual panel members. The relevance assessments are made with reference to the concepts of topical as well as situational relevance. The method takes into account the dynamic nature of information needs which are assumed to develop over time for the same user, a variability which is presumed to be strongly connected to the processes of relevance assessment.
This case study reports the investigations into the feasibility and reliability of calculating impact factors for web sites, called Web Impact Factors (Web-IF). The study analyses a selection of seven small and medium scale national and four large web domains as well as six institutional web sites over a series of snapshots taken of the web during a month. The data isolation and calculation methods are described and the tests discussed. The results thus far demonstrate that Web-IFs are calculable with high confidence for national and sector domains whilst institutional Web-IFs should be approached with caution. The data isolation method makes use of sets of inverted but logically identical Boolean set operations and their mean values in order to generate the impact factors associated with internal-(self-) link web pages and external-link web pages. Their logical sum is assumed to constitute the workable frequency of web pages linking up to the web location in question. The logical operations are necessary to overcome the variations in retrieval outcome produced by the AltaVista search engine.
This article introduces the application of informetric methods to the World Wide Web (WWW), also called Webometrics. A case study presents a workable method for general informetric analyses of the WWW. In detail, the paper describes a number of specific informetric analysis parameters. As a case study the Danish proportion of the WWW is compared to those of other Nordic countries. The methodological approach is comparable with common bibliometric analyses of the ISI citation databases. Among other results the analyses demonstrate that Denmark would seem to fall seriously behind the other Nordic countries with respect to visibility on the Net and compared to its position in scientific databases. 1. INTRODUCTION THE AIM OF THIS ARTICLE is to introduce and argue for the interesting idea that it is possible to utilise informetric methods on the World Wide Web (WWW). While informetrics is the research into information in a broad sense and not only limited to scientific communication, the approach taken here will be called Webometrics, which covers research of all network-based communication using informetric or other quantitative measures. It is obvious that informetric methods using word counts and similar techniques can be applied to the WWW. What is new is to regard the WWW as a citation network where the traditional information entities, and citations from them, are replaced by Web pages. These pages are the entities of information on the Web, with hyperlinks from them acting as citations. The use of informetric methods on the WWW is very exciting and allows for analyses to be carried out almost in the same way as is traditional in the citation databases. Until now these ideas have been used in only a few publications [1, 2]. The future for the use of informetric methods in the field of electronic communication was observed by William Paisley in 1990 [3, p. 286]:
In this article, we define webometrics within the framework of informetric studies and bibliometrics, as belonging to library and information science, and as associated with cybermetrics as a generic subfield. We develop a consistent and detailed link typology and terminology and make explicit the distinction among different Web node levels when using the proposed conceptual framework. As a consequence, we propose a novel diagram notation to fully appreciate and investigate link structures between Web nodes in webometric analyses. We warn against taking the analogy between citation analyses and link analyses too far.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.