2008
DOI: 10.1145/1361684.1361686

Interpreting TF-IDF term weights as making relevance decisions

Abstract: A novel probabilistic retrieval model is presented. It forms a basis to interpret the TF-IDF term weights as making relevance decisions. It simulates the local relevance decision-making for every location of a document, and combines all of these "local" relevance decisions as the "documentwide" relevance decision for the document. The significance of interpreting TF-IDF in this way is the potential to: (1) establish a unifying perspective about information retrieval as relevance decision-making; and (2) develo…

Cited by 650 publications (323 citation statements)
References 69 publications
“…Most of the content-based approaches focus on items which contain textual information such as news, books and other documents [15], [16]. Mooney et al [17] developed a book recommending system that utilizes semi-structured information about items gathered from the web using simple information extraction techniques.…”
Section: Related Work (mentioning)
confidence: 99%
“…As for why we are using only one artificial "addition" the same observation as in Case 2 stands. In this case the number of artificially added negative data points is one (δn A i = 1), so (5) can be transformed as (11):…”
Section: Calculation Of Weight Of Evidence For Binary Problems Wh (mentioning)
confidence: 99%
“…The term frequency-inverse document frequency (TF-IDF), as described in [10] and [11], is often used in text mining problems as a numerical statistic which estimates the importance of a word to a document in a collection of documents. In a similar manner this weight can be used to transform arbitrary nominal values into numerical just as it assigns weight to words in text mining and information retrieval.…”
Section: Introduction (mentioning)
confidence: 99%
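The statement above describes TF-IDF as a statistic that estimates how important a word is to a document within a collection. A minimal sketch of one common variant follows, assuming relative term frequency and an unsmoothed logarithmic IDF, log(N / df). The function name `tf_idf` and the toy documents are hypothetical, and this is not necessarily the weighting used by the cited paper or by the citing works.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy collection (hypothetical data, for illustration only).
docs = [
    ["retrieval", "model", "relevance"],
    ["relevance", "decision", "relevance"],
    ["book", "recommendation", "relevance"],
]
weights = tf_idf(docs)
```

Under this variant, a term that occurs in every document ("relevance" in the toy collection) gets a weight of zero, while a term confined to a single document is weighted by log(3) times its relative frequency.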
“…The application of both techniques to the body of the reports allows to extract the main clues in key-value pairs. Relevance of clues in each case is estimated in the next module, the "Clues relevance estimation", using relevance algorithm such as tf-idf [16]. The information at this point is expressed as:…”
Section: Fig 2 Clues Recommendation Algorithm (mentioning)
confidence: 99%