M. E. Maron scite author profile

On Relevance, Probabilistic Indexing and Information Retrieval

This paper reports on a novel technique for literature indexing and searching in a mechanized library system. The notion of relevance is taken as the key concept in the theory of information retrieval and a comparative concept of relevance is explicated in terms of the theory of probability. The resulting technique called “Probabilistic Indexing,” allows a computing machine, given a request for information, to make a statistical inference and derive a number (called the “relevance number”) for each document, which is a measure of the probability that the document will satisfy the given request. The result of a search is an ordered list of those documents which satisfy the request ranked according to their probable relevance. The paper goes on to show that whereas in a conventional library system the cross-referencing (“see” and “see also”) is based solely on the “semantical closeness” between index terms, statistical measures of closeness between index terms can be defined and computed. Thus, given an arbitrary request consisting of one (or many) index term(s), a machine can elaborate on it to increase the probability of selecting relevant documents that would not otherwise have been selected. Finally, the paper suggests an interpretation of the whole library problem as one where the request is considered as a clue on the basis of which the library system makes a concatenated statistical inference in order to provide as an output an ordered list of those documents which most probably satisfy the information needs of the user.
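As a rough illustration of the ranking idea described in this abstract, the sketch below scores each document for a single-term request by multiplying an assumed prior for the document by an assumed weight linking the request term to that document. This is one Bayes-style reading of a "relevance number", not the paper's own formulation. The function name `rank_by_relevance_number`, the weights, and the priors are all hypothetical.

```python
from typing import Dict, List, Tuple


def rank_by_relevance_number(
    term: str,
    index_weights: Dict[str, Dict[str, float]],
    doc_priors: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted by descending score.

    index_weights[doc][term] is an assumed estimate of how strongly the
    term is tied to the document; doc_priors[doc] is an assumed prior
    probability that the document is wanted at all. Both are illustrative.
    """
    scores = []
    for doc, weights in index_weights.items():
        weight = weights.get(term, 0.0)
        if weight > 0.0:
            # Score is proportional to prior * term weight (Bayes-style).
            scores.append((doc, doc_priors.get(doc, 0.0) * weight))
    # Highest score first; ties broken by document id for determinism.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores


# Hypothetical index: each document carries weighted index terms.
index_weights = {
    "doc_a": {"indexing": 0.9, "probability": 0.2},
    "doc_b": {"indexing": 0.4},
    "doc_c": {"probability": 0.7},
}
doc_priors = {"doc_a": 0.2, "doc_b": 0.5, "doc_c": 0.3}

ranking = rank_by_relevance_number("indexing", index_weights, doc_priors)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

With these made-up numbers, `doc_b` ranks above `doc_a` even though its term weight is lower, because its assumed prior is higher. Documents not indexed under the term are left out of the list.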


An evaluation of retrieval effectiveness for a full-text document-retrieval system

Blair

1985


Automatic Indexing: An Experimental Inquiry

Maron

1961

J. ACM



This inquiry examines a technique for automatically classifying (indexing) documents according to their subject content. The task, in essence, is to have a computing machine read a document and on the basis of the occurrence of selected clue words decide to which of many subject categories the document in question belongs. This paper describes the design, execution and evaluation of a modest experimental study aimed at testing empirically one statistical technique for automatic indexing.
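The task described in this abstract, assigning a document to a subject category from the clue words it contains, can be sketched with a small naive Bayes classifier. This is a standard textbook technique used here for illustration; the abstract does not specify the exact statistical method, and the categories, clue words, and training examples below are invented.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def train(
    examples: Iterable[Tuple[List[str], str]], clue_words: Set[str]
) -> Tuple[Dict[str, Counter], Counter]:
    """Count clue-word occurrences per category from (tokens, category) pairs."""
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    cat_counts: Counter = Counter()
    for tokens, category in examples:
        cat_counts[category] += 1
        word_counts[category].update(t for t in tokens if t in clue_words)
    return word_counts, cat_counts


def classify(
    tokens: List[str],
    word_counts: Dict[str, Counter],
    cat_counts: Counter,
    clue_words: Set[str],
) -> str:
    """Pick the category with the highest log-probability score."""
    total_docs = sum(cat_counts.values())
    best_category, best_score = "", -math.inf
    for category in sorted(cat_counts):
        # Log prior from category frequency in the training examples.
        score = math.log(cat_counts[category] / total_docs)
        # Add-one smoothing over the clue-word vocabulary.
        denom = sum(word_counts[category].values()) + len(clue_words)
        for token in tokens:
            if token in clue_words:
                score += math.log((word_counts[category][token] + 1) / denom)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical clue words and training documents.
clue_words = {"transistor", "circuit", "compiler", "program"}
examples = [
    (["transistor", "circuit", "design"], "hardware"),
    (["circuit", "noise"], "hardware"),
    (["compiler", "program", "language"], "software"),
    (["program", "debugging"], "software"),
]
word_counts, cat_counts = train(examples, clue_words)

predicted = classify(
    ["a", "new", "compiler", "for", "program", "translation"],
    word_counts, cat_counts, clue_words,
)
print(predicted)
```

Words outside the clue-word set are ignored, so only the selected clue words influence the decision.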


On indexing, retrieval and the meaning of about

Maron

1977

J. Am. Soc. Inf. Sci.



The primary objective of this paper is to examine the concept of about as it is used in its information retrieval sense when, for example, an indexer judges that a document is (or is not) about some given subject. The problem with about is that it is a very complex notion and we are unable to say precisely what it is we do when we make judgment of aboutness. Since about is at the heart of indexing, how are we to formulate any proper theory of indexing if we cannot explicate precisely the key concept of about? In this paper we look at this concept of about and offer a solution to the problem mentioned; it consists of an operational definition of about which interprets about in terms of search behavior.

A second objective of this paper is to show that about is, in fact, not the central concept in a theory of document retrieval. A document retrieval system ought to provide a ranked output (in response to a search query) not according to the degree that they are about the topic sought by the inquiring patron, but rather according to the probability that they will satisfy that person's information need. This paper shows how aboutness is related to probability of satisfaction.


Foundations of Probabilistic and Utility-Theoretic Indexing

Cooper

Maron

1978

J. ACM


One of the most perplexing problems of information retrieval has been the establishment of rational criteria for deciding what index terms or descriptors to assign to a unit of stored information for purposes of later retrieval. Both probabilistic and utility-theoretic criteria have in the past been proposed for this purpose. The present paper derives explicit decision rules of both kinds from a common conceptual and mathematical foundation. The result is a unified theory of indexing.

KEY WORDS AND PHRASES: indexing, cataloging, classification, index terms, descriptors, information retrieval, document retrieval, reference retrieval, utility-theoretic indexing, probabilistic indexing

CR CATEGORIES: 3.70, 3.71, 3.72, 3.75

The question of how to index documents is widely regarded as a main issue, if indeed not the central theoretical problem, of the subfield of information retrieval known as document or reference retrieval. The problem setting is as follows. There exists a large document collection on the one hand, and on the other a population of individuals (potential retrieval system patrons) each of whom needs or wants information he thinks might be supplied by documents in the collection. The indexing problem is: How should the documents in the collection be identified ("indexed," "cataloged," etc.) so that the collection can be searched to the maximal collective benefit of the patrons?

In 1960 one of us (Maron), in collaboration with J.L. Kuhns, addressed this question and developed a theory of indexing known as Probabilistic Indexing ([14]; cf. [11,12]). The theory interpreted the indexing operation in such a way that a document retrieval system could use index information to compute and rank output documents according to the probability that each would satisfy the inquiring patron.

More recently, Cooper developed another theory of indexing which might appropriately be called Utility-Theoretic Indexing since it is based on the precepts of utility theory, including the rudiments of decision theory [5]. Utility-Theoretic Indexing is predicated on the assumption that index terms should be assigned to documents in such a way as to reflect the utility (or value) that the document in question would be expected to provide to the patron searching under the term in question. Related approaches have been explored by Bookstein and Swanson [1], Harter [8], Kraft [10], Kochen [9], and others. The purpose of this paper is to explain and clarify the conceptual foundations common to both Probabilistic and Utility-Theoretic Indexing, and to show how the two theories complement one another.



