This article summarizes the theory of psychological relevance proposed by Dan Sperber and Deirdre Wilson (1988) to explicate the relevance of speech utterances to hearers in everyday conversation. The theory is then interpreted as the concept of relevance in information retrieval, and an extended example is presented. Implications of psychological relevance for research in information retrieval; evaluation of information retrieval systems; and the concepts of information, information need, and the information-seeking process are explored. Connections of the theory to ideas in bibliometrics are also suggested.
The purpose of this article is to bring attention to the problem of variations in relevance assessments and the effects that these may have on measures of retrieval effectiveness. Through an analytical review of the literature, I show that despite known wide variations in relevance assessments in experimental test collections, their effects on the measurement of retrieval performance are almost completely unstudied. I will further argue that what we know about the many variables that have been found to affect relevance assessments under experimental conditions, as well as our new understanding of psychological, situational, user-based relevance, points to a single conclusion. We can no longer rest the evaluation of information retrieval systems on the assumption that such variations do not significantly affect the measurement of information retrieval performance. A series of thorough, rigorous, and extensive tests is needed to determine precisely how, and under what conditions, variations in relevance assessments do, and do not, affect measures of retrieval performance. We need to develop approaches to evaluation that are sensitive to these variations, and to human factors and individual differences more generally. Our approaches to evaluation must reflect the real world of real users.
The problem studied in this research is that of developing a set of formal statistical rules for identifying the keywords of a document: words likely to be useful as index terms for that document. The research was prompted by the observation, made by a number of writers, that non-specialty words, which possess little value for indexing purposes, tend to be distributed at random in a collection of documents. In contrast, specialty words are not so distributed.
In Part I of the study, a mixture of two Poisson distributions is examined in detail as a model of specialty word distribution, and formulas expressing the three parameters of the model in terms of empirical frequency statistics are derived. The fit of the model is tested on an experimental document collection and found to be acceptable for the purposes of the study. A measure intended to identify specialty words, consistent with the 2‐Poisson model, is proposed and evaluated.