Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval 2011
DOI: 10.1145/2009916.2010048

Faster top-k document retrieval using block-max indexes

Abstract: Large search engines process thousands of queries per second over billions of documents, making query processing a major performance bottleneck. An important class of optimization techniques called early termination achieves faster query processing by avoiding the scoring of documents that are unlikely to be in the top results. We study new algorithms for early termination that outperform previous methods. In particular, we focus on safe techniques for disjunctive queries, which return the same result as an ex…

Citations: cited by 182 publications (171 citation statements)
References: 41 publications
“…These techniques include MaxScore [39], WAND [6], and BMW [17]. In this paper, we focus our attention on the WAND strategy, since we deal with (rewritten) queries that can have a large number of terms (i.e., long queries).…”
Section: Preliminaries
Citation type: mentioning
confidence: 99%
“…Such maximum score could be significantly larger than the typical score contribution of that term, in fact limiting the opportunities to skip large amounts of documents. To tackle this problem, Ding and Suel [6] proposed to augment the inverted index data structures with additional information to store more accurate upper bounds: at indexing time each posting list is split into consecutive blocks of constant size, e.g. 128 postings per block.…”
Section: Query Processing
Citation type: mentioning
confidence: 99%
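The block structure described in the statement above can be illustrated with a minimal sketch. It assumes a posting list held in memory as `(doc_id, score)` pairs sorted by document ID; that layout, the function name `block_max_scores`, and the choice to record each block's last document ID are assumptions made for illustration, not details taken from the cited papers.

```python
from typing import List, Tuple

# Hypothetical in-memory layout: (doc_id, score contribution of the term).
Posting = Tuple[int, float]


def block_max_scores(postings: List[Posting], block_size: int = 128) -> List[Tuple[int, float]]:
    """Split a posting list into consecutive fixed-size blocks and return,
    for each block, (last doc_id in the block, maximum score in the block)."""
    blocks = []
    for start in range(0, len(postings), block_size):
        block = postings[start:start + block_size]
        last_doc_id = block[-1][0]
        max_score = max(score for _, score in block)
        blocks.append((last_doc_id, max_score))
    return blocks


# Tiny example with block_size=2 so the effect is visible.
postings = [(1, 0.2), (4, 0.3), (7, 5.0), (9, 0.1), (12, 0.4), (15, 0.2)]
print(block_max_scores(postings, block_size=2))
```

In this toy list the single list-wide maximum is 5.0, while two of the three blocks have an upper bound of at most 0.4, which is the kind of tighter bound the statement refers to.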
“…These block term upper bounds can then be exploited by adapting existing algorithms such as Wand to make use of the additional information. The resulting algorithm is BlockMaxWand (BMW) [6]. The authors reported an average query response time reduction of BMW compared to Wand of 64%-67%.…”
Section: Query Processing
Citation type: mentioning
confidence: 99%
“…For example, modern search engines use the set intersection for inverted posting list which is a standard data structure in information retrieval to return relevant documents. So it has been studied in many domains and fields [3]- [8]. One of the most typical situations is Boolean query which is required to retrieval the documents that contains all the terms in the query.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
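The Boolean (conjunctive) case mentioned in the statement above reduces to intersecting sorted posting lists. The following is a minimal sketch of the textbook two-pointer merge, assuming each posting list is a strictly increasing list of document IDs; the function name `intersect_postings` is hypothetical, and this is not claimed to be the method of any of the cited works.

```python
from typing import List


def intersect_postings(a: List[int], b: List[int]) -> List[int]:
    """Return the doc IDs present in both sorted posting lists."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Document contains both terms.
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # advance the list with the smaller current doc ID
        else:
            j += 1
    return out


print(intersect_postings([1, 3, 5, 8, 13], [2, 3, 8, 21]))
```

For a query with more than two terms, the same function can be applied pairwise, usually starting from the shortest lists so the intermediate result stays small.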