We report the results of a British Library Research and Development Department funded design study for an interactive information retrieval system which will determine structural representations of the anomalous states of knowledge (ASKs) underlying information needs, and attempt to resolve the anomalies through a variety of retrieval strategies performed on a database of documents represented in compatible structural formats. Part I discusses the background to the project and the theory underlying it; Part II (next issue) presents our methods, results and conclusions. Basic premises of the project were: that information needs are not in principle precisely specifiable; that it is possible to elicit problem statements from information system users from which representations of the ASK underlying the need can be derived; that there are classes of ASKs; and that all elements of information retrieval systems ought to be based on the user's ASK. We have developed a relatively freeform interview technique for eliciting problem statements, and a statistical word co-occurrence analysis for deriving network representations of the problem statements and abstracts. Structural characteristics of the representations have been used to determine classes of ASKs, and both ASK and information structures have been evaluated by, respectively, users and authors. Some results are: that interviewing appears to be a satisfactory technique for eliciting problem statements from which ASKs can be determined; that the statistical analysis produces structures which are generally appropriate both for documents and problem statements; that ASKs thus represented can be usefully classified according to their structural characteristics; and that of thirty-five subjects, only two had ASKs for which traditional 'best match' retrieval would be intuitively appropriate.
The results of the design study indicate that at least some of our premises are reasonable, and that an ASK-based information retrieval system is at least feasible.
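As a rough illustration of what a word co-occurrence analysis that yields a network representation can look like, the following minimal sketch counts how often two words appear in the same sentence and keeps the pairs as weighted edges. It is not the authors' procedure, which is described in Part II; the sentence-level window, the raw counts, the small stopword list, and the names `cooccurrence_network`, `STOPWORDS` and `min_weight` are all assumptions made for this example.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "are"}


def cooccurrence_network(text, min_weight=1):
    """Return {(word_a, word_b): count} for words that share a sentence.

    Each key is an alphabetically ordered pair, so every edge is stored once.
    Edges seen fewer than `min_weight` times are dropped.
    """
    edges = Counter()
    # Treat sentence-ending punctuation as the co-occurrence window boundary.
    for sentence in re.split(r"[.!?;]+", text.lower()):
        words = sorted(
            {w for w in re.findall(r"[a-z]+", sentence) if w not in STOPWORDS}
        )
        for pair in combinations(words, 2):
            edges[pair] += 1
    return {pair: n for pair, n in edges.items() if n >= min_weight}


statement = (
    "Retrieval systems rank documents. "
    "Users describe problems. "
    "Retrieval systems help users."
)
network = cooccurrence_network(statement, min_weight=2)
print(network)
```

Because the same function can be applied to a problem statement and to a document abstract, both end up in a compatible edge-list format, which is the property the abstract asks of the two representations.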
INTRODUCTION
Improvements in the performance of information retrieval (IR) systems as presently designed seem to be limited to only marginal gains in terms of complete recall and precision or complete user satisfaction (see, e.g., Robertson and Sparck Jones¹). In these two papers (Part II: Results of a design study, to appear in Journal of Documentation, vol. 38 no. 3) we report on a design study² for an experimental IR system based on radically different hypotheses than those underlying present systems, which we think may allow the design of IR systems which produce significantly better performance than currently offered.
Information filtering systems are designed for unstructured or semistructured data, as opposed to database applications, which use very structured data. The systems also deal primarily with textual information, but they may also entail images, voice, video or other data types that are part of multimedia information systems. Information filtering systems also involve a large amount of data and streams of incoming data, whether broadcast from a remote source or sent directly by other sources. Filtering is based on descriptions of individual or group information preferences, or profiles, that typically represent long-term interests. Filtering also implies removal of data from an incoming stream rather than finding data in the stream; users see only the data that is extracted. Models of information retrieval and filtering, and lessons for filtering from retrieval research are presented.
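The profile-based filtering described above can be sketched as a generator that passes along only those items of an incoming stream that overlap a stored interest profile. This is a minimal sketch under simplifying assumptions: the profile is a plain set of terms, matching is by shared words, and the names `tokenize`, `filter_stream` and `min_overlap` are hypothetical rather than taken from any particular system.

```python
import re


def tokenize(text):
    """Lower-case a text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def filter_stream(stream, profile, min_overlap=1):
    """Yield only the items sharing at least `min_overlap` terms with the profile.

    Items that do not match are dropped from the stream, so the user
    sees only what is extracted.
    """
    profile_terms = {term.lower() for term in profile}
    for item in stream:
        if len(tokenize(item) & profile_terms) >= min_overlap:
            yield item


incoming = [
    "New retrieval model released",
    "Football scores tonight",
    "Query expansion for retrieval",
]
long_term_interests = {"retrieval", "query"}

for message in filter_stream(incoming, long_term_interests):
    print(message)
```

Writing the filter as a generator reflects the streaming setting: items are examined one at a time as they arrive, and the profile, not a one-off query, decides what is kept.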