Faster and smaller inverted indices with treaps

Konow, Roberto; Navarro, Gonzalo; Clarke, Charles L. A.; López-Ortiz, Alejandro

doi:10.1145/2484028.2484088

Cited by 25 publications

(12 citation statements)

References 41 publications


“…Although similar in spirit, these two non-parametric solutions use different techniques that are of independent interest (see Konow et al (2013) for a recent application of our techniques). We have shown some advantages of our proposal, such as independence on the encoding, a richer set of operations (e.g.…”

Section: Discussion (mentioning)

confidence: 99%

On the compression of search trees

Claude

Nicholson

Seco

2014

Information Processing & Management


Let $X = x_1, x_2, \ldots, x_n$ be a sequence of non-decreasing integer values. Storing a compressed representation of X that supports access and search is a problem that occurs in many domains. The most common solution to this problem uses a linear list and encodes the differences between consecutive values with encodings that favor small numbers. This solution includes additional information (i.e. samples) to support efficient searching on the encoded values. We introduce a completely different alternative that achieves compression by encoding the differences in a search tree. Our proposal has many applications, such as the representation of posting lists, geographic data, sparse bitmaps, and compressed suffix arrays, to name just a few. The structure is practical and we provide an experimental evaluation to show that it is competitive with the existing techniques.
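The "linear list with samples" baseline described in this abstract can be illustrated with a minimal sketch. It is not the search-tree method the citing paper proposes, and it is not taken from either paper: the class name `SampledGapList`, the `step` parameter, and the use of plain Python integers for the gaps (instead of a real small-number code such as variable-byte) are all assumptions made for illustration.

```python
from bisect import bisect_left


class SampledGapList:
    """Non-decreasing integers stored as gaps, with an absolute sample every `step` items.

    Hypothetical illustration: gaps are kept as plain ints, not bit-packed.
    """

    def __init__(self, values, step=4):
        self.step = step
        self.n = len(values)
        # Absolute values at positions 0, step, 2*step, ...
        self.samples = values[::step]
        # gaps[i] = values[i] - values[i - 1]; gaps[0] is a placeholder.
        self.gaps = [0] + [values[i] - values[i - 1] for i in range(1, self.n)]

    def access(self, i):
        """Return values[i]: start at the preceding sample and add up the gaps."""
        if not 0 <= i < self.n:
            raise IndexError(i)
        block = i // self.step
        value = self.samples[block]
        for j in range(block * self.step + 1, i + 1):
            value += self.gaps[j]
        return value

    def search(self, x):
        """Return the smallest index i with values[i] >= x, or n if there is none."""
        b = bisect_left(self.samples, x)  # first sample that is >= x
        if b == 0:
            return 0
        # The answer lies after sample b - 1 (which is < x) and at or before sample b.
        start = (b - 1) * self.step
        end = min(b * self.step, self.n)
        value = self.samples[b - 1]
        for i in range(start + 1, end):
            value += self.gaps[i]
            if value >= x:
                return i
        return end


postings = [3, 7, 7, 12, 20, 21, 35, 40, 41]
gap_list = SampledGapList(postings, step=4)
print(gap_list.access(5))   # 21
print(gap_list.search(13))  # 4, the position of 20
```

A larger `step` stores fewer samples but makes each `access` or `search` scan more gaps, which is the space/time trade-off the samples exist to control.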


“…By also storing the priority data, they can answer top-k queries in O(k log k) or O(k log log n) time. The treap can also be used to compress the representation of keys and priorities [26]. Similar data structures for two or more dimensions are convenient only for dense grids (full of points) [27].…”

Section: Treaps Priority Search Trees and Ranked Range Queries (mentioning)

confidence: 99%

Aggregated 2D range queries on clustered points

Brisaboa

Bernardo

Konow

et al. 2016

Information Systems

Self Cite


Efficient processing of aggregated range queries on two-dimensional grids is a common requirement in information retrieval and data mining systems, for example in Geographic Information Systems and OLAP cubes. We introduce a technique to represent grids supporting aggregated range queries that requires little space when the data points in the grid are clustered, which is common in practice. We show how this general technique can be used to support two important types of aggregated queries, which are ranked range queries and counting range queries. Our experimental evaluation shows that this technique can speed up aggregated queries up to more than an order of magnitude, with a small space overhead.
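The citation statement in this record mentions answering top-k queries on a treap that stores priorities. A minimal sketch of that idea follows, assuming distinct keys and a max-heap order on priorities. The names `Node`, `build_treap`, and `top_k` are hypothetical, and the heap-driven traversal shown here is a simple illustration rather than the algorithm of the cited papers; no claim is made that it achieves the quoted time bounds.

```python
import heapq
from itertools import count


class Node:
    def __init__(self, key, priority):
        self.key = key
        self.priority = priority
        self.left = None
        self.right = None


def build_treap(pairs):
    """Build a treap (BST on key, max-heap on priority) from (key, priority) pairs.

    Uses the standard stack-based Cartesian tree construction over keys in sorted order.
    """
    stack = []
    for key, priority in sorted(pairs):
        node = Node(key, priority)
        last = None
        # Pop nodes with lower priority; they become the left subtree of the new node.
        while stack and stack[-1].priority < priority:
            last = stack.pop()
        node.left = last
        if stack:
            stack[-1].right = node
        stack.append(node)
    return stack[0] if stack else None


def top_k(root, lo, hi, k):
    """Return up to k (key, priority) pairs with lo <= key <= hi, highest priority first."""
    heap = []
    tie = count()  # tie-breaker so the heap never compares Node objects

    def push(node):
        if node is not None:
            heapq.heappush(heap, (-node.priority, next(tie), node))

    push(root)
    result = []
    while heap and len(result) < k:
        _, _, node = heapq.heappop(heap)
        if lo <= node.key <= hi:
            result.append((node.key, node.priority))
        # Descend only into subtrees that can still hold keys inside [lo, hi].
        if node.key > lo:
            push(node.left)
        if node.key < hi:
            push(node.right)
    return result


pairs = [(1, 5), (2, 9), (3, 1), (4, 7), (5, 3), (6, 8), (7, 2)]
root = build_treap(pairs)
print(top_k(root, 2, 6, 3))  # [(2, 9), (6, 8), (4, 7)]
```

Because every node's priority is at least that of its descendants, popping the frontier in priority order reports in-range keys from highest to lowest priority without visiting the whole range.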


“…A complex case includes positions of every occurrence of a term in a document, properties of that term, or even the results of an additional linguistic processing. In cases like this, based on term proximity, phrase queries and document scoring are provided [8]. The next example shows how to create and use an inverted index for some corpus of text documents.…”

Section: Inverted Index Structure (mentioning)

confidence: 99%
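The example that the citing paper refers to is not reproduced in the excerpt above. As a stand-in, here is a minimal sketch of a positional inverted index with a phrase query, which is the kind of structure the statement describes. The function names `build_index` and `phrase_query`, the whitespace tokenizer, and the sample corpus are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]} using naive lowercase whitespace tokens."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_query(index, phrase):
    """Return the sorted ids of documents that contain the terms of `phrase` consecutively."""
    terms = phrase.lower().split()
    if not terms or any(term not in index for term in terms):
        return []
    # Documents that contain every term, regardless of position.
    candidates = set(index[terms[0]])
    for term in terms[1:]:
        candidates &= set(index[term])
    hits = []
    for doc_id in sorted(candidates):
        # A match starts at `start` if term number `offset` occurs at start + offset.
        for start in index[terms[0]][doc_id]:
            if all(start + offset in index[term][doc_id]
                   for offset, term in enumerate(terms)):
                hits.append(doc_id)
                break
    return hits


docs = ["the quick brown fox", "the brown quick fox", "quick brown"]
index = build_index(docs)
print(index["fox"])                        # {0: [3], 1: [3]}
print(phrase_query(index, "quick brown"))  # [0, 2]
```

Storing positions is what makes the phrase check possible: document 1 contains both terms but in the wrong order, so it is excluded.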