Proceedings of the 14th International Conference on Extending Database Technology 2011
DOI: 10.1145/1951365.1951394

Efficient answering of set containment queries for skewed item distributions

Abstract: In this paper we address the problem of efficiently evaluating containment (i.e., subset, equality, and superset) queries over set-valued data. We propose a novel indexing scheme, the Ordered Inverted File (OIF), which, unlike the state of the art, indexes set-valued attributes in an ordered fashion. We introduce query processing algorithms that practically treat containment queries as range queries over the ordered postings lists of OIF and exploit this ordering to quickly prune unnecessary page acces…
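
For orientation only, the three containment predicates named in the abstract can be sketched over a classic, unordered inverted file. This is not the OIF scheme, whose ordering and pruning are not described in the text above; it is the baseline structure such schemes extend. All function names and the toy data are hypothetical, and the code assumes every record is a non-empty Python set held in memory.

```python
from collections import defaultdict


def build_index(records):
    """Map each item to the ascending list of record ids whose set contains it."""
    postings = defaultdict(list)
    for rid in sorted(records):
        for item in records[rid]:
            postings[item].append(rid)
    return dict(postings)


def records_containing(postings, records, query):
    """Ids of records whose set is a superset of `query`.

    Intersects the postings lists of all query items, shortest list first.
    """
    query = set(query)
    if not query:
        return sorted(records)  # every record contains the empty set
    lists = sorted((postings.get(item, []) for item in query), key=len)
    result = set(lists[0])
    for plist in lists[1:]:
        result &= set(plist)
    return sorted(result)


def records_equal_to(postings, records, query):
    """Ids of records whose set equals `query`.

    A record that contains the query and has the same size must equal it.
    """
    query = set(query)
    return [rid for rid in records_containing(postings, records, query)
            if len(records[rid]) == len(query)]


def records_contained_in(postings, records, query):
    """Ids of records whose set is a subset of `query`.

    Counts, per record, how many query items it holds; a record qualifies
    when that count equals its own size. Records with empty sets never
    appear in any postings list, so they are not returned (assumption above).
    """
    hits = defaultdict(int)
    for item in set(query):
        for rid in postings.get(item, []):
            hits[rid] += 1
    return sorted(rid for rid, n in hits.items() if n == len(records[rid]))


# Toy data, invented for illustration.
records = {1: {"a", "b"}, 2: {"a", "b", "c"}, 3: {"b"}, 4: {"a", "c"}}
index = build_index(records)
print(records_containing(index, records, {"a", "b"}))    # [1, 2]
print(records_equal_to(index, records, {"a", "b"}))      # [1]
print(records_contained_in(index, records, {"a", "b"}))  # [1, 3]
```

With a skewed item distribution, the postings list of a frequent item such as "a" grows long, and each query touching it pays for the whole list. That cost is what the abstract says an ordered layout is meant to reduce.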

Cited by 22 publications
(20 citation statements)
References 46 publications
“…(1) Our empirical study showed that skewed data is challenging for our algorithms. Incorporation in our algorithms of recent results on efficiently dealing with list intersections and data skew should be investigated [5,35]. (2) It would be interesting to study the impact on our solutions of variations to the data model (e.g., multi-set and list types) or query model (e.g., bottom-up subtree embeddings [36]).…”
Section: Discussion (mentioning)
confidence: 99%
“…In the literature on processing containment queries on flat sets, solutions using inverted files as the physical representation of the database have demonstrated robust efficient performance [15,24,35]. Furthermore, a variety of industrial-strength open-source solutions for building inverted files are available off the shelf, and are widely adopted and used by practitioners.…”
Section: Preliminaries (mentioning)
confidence: 99%
“…Experimental studies [22,44] showed that inverted files outperform signature-based indices for set containment queries on datasets with low cardinality set objects, e.g., typical text databases. In [37,38], the authors proposed extensions of the classic inverted file data structure, which optimize the indexing set-valued data with skewed item distributions. In [14], the authors proposed an indexing scheme for text documents, which includes inverted lists for frequent word combinations.…”
Section: Set Containment Queries (mentioning)
confidence: 99%