Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval 2013
DOI: 10.1145/2484028.2484088

Faster and smaller inverted indices with treaps

Abstract: We introduce a new representation of the inverted index that performs faster ranked unions and intersections while using less space. Our index is based on the treap data structure, which allows us to intersect/merge the document identifiers while simultaneously thresholding by frequency, instead of the costlier two-step classical processing methods. To achieve compression we represent the treap topology using compact data structures. Further, the treap invariants allow us to elegantly encode differentially bot…
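The idea of intersecting document identifiers while thresholding by frequency can be illustrated with a small sketch. It is not the paper's implementation: it assumes an uncompressed, pointer-based treap and a simple per-list minimum-frequency cutoff, and the names `Node`, `build`, `docs_at_least`, `find` and `intersect` are hypothetical. Each posting list becomes a tree that is a binary search tree on document identifier and a max-heap on term frequency, so any subtree whose root falls below the cutoff can be skipped entirely.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    doc: int                      # document identifier (BST key)
    freq: int                     # term frequency (max-heap priority)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(postings: List[Tuple[int, int]], lo: int = 0, hi: Optional[int] = None) -> Optional[Node]:
    """Build a treap from (doc, freq) pairs sorted by doc id.

    The highest-frequency posting in the range becomes the root
    (leftmost on ties). Quadratic in the worst case; fine for a sketch.
    """
    if hi is None:
        hi = len(postings)
    if lo >= hi:
        return None
    m = max(range(lo, hi), key=lambda i: postings[i][1])
    doc, freq = postings[m]
    return Node(doc, freq, build(postings, lo, m), build(postings, m + 1, hi))


def docs_at_least(node: Optional[Node], min_freq: int) -> List[Tuple[int, int]]:
    """Return postings with freq >= min_freq in doc-id order.

    The heap invariant lets us prune: if a root is below the cutoff,
    so is every node underneath it.
    """
    if node is None or node.freq < min_freq:
        return []
    return (docs_at_least(node.left, min_freq)
            + [(node.doc, node.freq)]
            + docs_at_least(node.right, min_freq))


def find(node: Optional[Node], doc: int, min_freq: int) -> Optional[int]:
    """Look up doc, abandoning the search once frequencies drop below the cutoff."""
    while node is not None and node.freq >= min_freq:
        if doc == node.doc:
            return node.freq
        node = node.left if doc < node.doc else node.right
    return None


def intersect(a: Optional[Node], b: Optional[Node], min_freq: int) -> List[Tuple[int, int, int]]:
    """Docs present in both lists with frequency >= min_freq in each (assumed cutoff rule)."""
    result = []
    for doc, fa in docs_at_least(a, min_freq):
        fb = find(b, doc, min_freq)
        if fb is not None:
            result.append((doc, fa, fb))
    return result


list_a = build([(1, 3), (4, 1), (7, 5), (9, 2)])
list_b = build([(4, 6), (7, 2), (9, 4), (12, 1)])
print(intersect(list_a, list_b, 2))  # [(7, 5, 2), (9, 2, 4)]
```

A ranked query would typically threshold on a combined score across lists instead of a fixed per-list cutoff; the fixed cutoff is used here only to keep the pruning step easy to follow.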

Cited by 25 publications (12 citation statements)
References 41 publications
“…Although similar in spirit, these two non-parametric solutions use different techniques that are of independent interest (see Konow et al (2013) for a recent application of our techniques). We have shown some advantages of our proposal, such as independence on the encoding, a richer set of operations (e.g.…”
Section: Discussion (mentioning)
confidence: 99%
“…By also storing the priority data, they can answer top-k queries in O(k log k) or O(k log log n) time. The treap can also be used to compress the representation of keys and priorities [26]. Similar data structures for two or more dimensions are convenient only for dense grids (full of points) [27].…”
Section: Treaps Priority Search Trees and Ranked Range Queries (mentioning)
confidence: 99%
“…A complex case includes positions of every occurrence of a term in a document, properties of that term, or even the results of an additional linguistic processing. In cases like this, based on term proximity, phrase queries and document scoring are provided [8]. The next example shows how to create and use an inverted index for some corpus of text documents.…”
Section: Inverted Index Structure (mentioning)
confidence: 99%