Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2008
DOI: 10.1145/1390334.1390376

A cluster-based resampling method for pseudo-relevance feedback

Abstract: Typical pseudo-relevance feedback methods assume the top-retrieved documents are relevant and use these pseudo-relevant documents to expand terms. The initial retrieval set can, however, contain a great deal of noise. In this paper, we present a cluster-based resampling method to select better pseudo-relevant documents based on the relevance model. The main idea is to use document clusters to find dominant documents for the initial retrieval set, and to repeatedly feed the documents to emphasize the core topics …
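
The idea of finding dominant documents through overlapping clusters and feeding them repeatedly can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the parameters `k` and `top_clusters`, the nearest-neighbour clustering, and the cohesion-based cluster score are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def resample_feedback_docs(doc_vectors, k=2, top_clusters=2):
    """Return a feedback list in which dominant documents are repeated.

    doc_vectors: dict doc_id -> {term: weight} for the top-retrieved documents.
    k, top_clusters: hypothetical parameters (neighbours per cluster and
    number of clusters kept); neither comes from the paper.
    """
    ids = list(doc_vectors)
    clusters = []
    for seed in ids:
        # One cluster per seed document: the seed plus its k nearest
        # neighbours. Clusters overlap because a document can be a
        # neighbour of several seeds.
        neighbours = sorted(
            (d for d in ids if d != seed),
            key=lambda d: cosine(doc_vectors[seed], doc_vectors[d]),
            reverse=True,
        )[:k]
        # Assumed cluster score: mean similarity of the neighbours to the
        # seed. This is a stand-in, not the ranking used in the paper.
        score = sum(
            cosine(doc_vectors[seed], doc_vectors[d]) for d in neighbours
        ) / max(len(neighbours), 1)
        clusters.append((score, [seed] + neighbours))

    clusters.sort(key=lambda c: c[0], reverse=True)

    # Concatenate the members of the best clusters. A document that sits in
    # many of them is emitted many times, so it weighs more in any
    # feedback model estimated from this list.
    feedback = []
    for _, members in clusters[:top_clusters]:
        feedback.extend(members)
    return feedback


# Toy data: d1-d3 share a topic, d4 is off-topic noise.
docs = {
    "d1": {"cluster": 1.0, "feedback": 1.0},
    "d2": {"cluster": 1.0, "feedback": 1.0, "query": 1.0},
    "d3": {"cluster": 1.0, "query": 1.0},
    "d4": {"football": 1.0},
}
feedback = resample_feedback_docs(docs, k=1, top_clusters=3)
counts = Counter(feedback)
```

In this toy run, `d2` belongs to every retained cluster and so appears three times in `feedback`, while the off-topic `d4` is dropped. Expansion terms estimated from the resampled list would therefore lean toward the documents that many clusters agree on.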

Cited by 130 publications (99 citation statements)
References 29 publications
“…We note that the ranking and optimality of clusters can be improved with more sophisticated techniques [12,19,20,21, inter alia]. However, this is outside the scope of this paper.…”
Section: Clustering Web Results (mentioning)
confidence: 99%
“…Under this hypothesis, a more accurate query model can be estimated from refined pseudo-feedback documents. For example, Lee et al (2008) proposed a resampling method by applying overlapping clusters to select dominant documents, which are connected with many subtopic clusters and have several highly similar documents. This resampling approach showed higher relevance density and better retrieval accuracy in their experiments.…”
Section: 2 Methods of Optimizing Query Models in Language Modeling (mentioning)
confidence: 99%
“…The authors evaluate their algorithm on several small test collections, without achieving any improvements over the unexpanded queries. More recently, Lee, Croft, and Allan (2008) have shown that detecting clusters in a set of (pseudo-)relevant documents is helpful for identifying dominant documents for a query and, thus, for subsequent query expansion, a finding which was corroborated on different test collections by Kurland (2008). These approaches all exploit the notion that "associations between documents convey information about the relevance of documents to requests" (Jardine & van Rijsbergen, 1971).…”
Section: Related Work (mentioning)
confidence: 98%