2019
DOI: 10.1007/s41019-019-00104-1

Selectivity Estimation on Set Containment Search

Abstract: In this paper, we study the problem of selectivity estimation on set containment search. Given a query record Q and a record dataset S, we aim to accurately and efficiently estimate the selectivity of a set containment search for Q over S. We first extend existing distinct-value estimation techniques to solve this problem and develop IL-GKMV, an approach based on inverted lists and G-KMV sketches. Our analysis shows that the performance of IL-GKMV degrades as the vocabulary size increases. Motivated by limitations of …
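To make the problem statement concrete, here is a minimal Python sketch. It assumes that a record X matches when it contains every element of Q (Q ⊆ X), and that selectivity is the fraction of records in S that match. The function names (`build_inverted_lists`, `exact_selectivity`, `kmv_sketch`, `kmv_intersection_estimate`) are hypothetical. The estimator is the plain KMV (k minimum values) intersection estimator applied to inverted lists, shown only to illustrate the idea of sketch-based estimation; it is not the paper's G-KMV sketch or its IL-GKMV method.

```python
import hashlib
from collections import defaultdict


def build_inverted_lists(records):
    """Map each element to the set of ids of the records that contain it."""
    inverted = defaultdict(set)
    for rid, record in enumerate(records):
        for element in record:
            inverted[element].add(rid)
    return inverted


def exact_selectivity(query, inverted, num_records):
    """Exact fraction of records X with query ⊆ X (assumed definition)."""
    if num_records == 0:
        return 0.0
    if not query:
        return 1.0  # every record contains the empty query
    lists = [inverted.get(e, set()) for e in query]
    matches = set.intersection(*lists)
    return len(matches) / num_records


def unit_hash(rid):
    """Deterministically hash a record id to a float in [0, 1)."""
    digest = hashlib.sha1(str(rid).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_sketch(ids, k):
    """Keep the k smallest hash values of an inverted list."""
    return sorted(unit_hash(r) for r in ids)[:k]


def kmv_intersection_estimate(sketches):
    """Estimate the size of the intersection of the sketched lists.

    Plain KMV: take the k smallest hashes of the combined sketches,
    estimate the union size as (k - 1) / U_k, and scale it by the
    fraction of those k hashes that appear in every sketch.
    """
    k = min(len(s) for s in sketches)
    if k == 0:
        return 0.0  # some element occurs in no record
    union = sorted(set().union(*sketches))[:k]
    sketch_sets = [set(s) for s in sketches]
    common = sum(1 for v in union if all(v in s for s in sketch_sets))
    u_k = union[k - 1]  # k-th smallest hash value in the union
    return (common / k) * ((k - 1) / u_k)


records = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "c", "d"}]
inverted = build_inverted_lists(records)
exact = exact_selectivity({"a", "b"}, inverted, len(records))
sketches = [kmv_sketch(inverted[e], k=2) for e in ("a", "b")]
approx = kmv_intersection_estimate(sketches) / len(records)
```

On a dataset this small the exact intersection is cheap and the sketch estimate is very noisy. A sketch only pays off when the inverted lists are too long to intersect at estimation time.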

Cited by 6 publications (1 citation statement)
References 28 publications (33 reference statements)
“…Figure 1 illustrates the three most important components in a cost-based optimizer: cardinality estimation (CE), cost model (CM), and plan enumeration (PE). CE uses statistics of data and some assumptions about data distribution, column correlation, and join relationship to get the number of tuples generated by an intermediate operator, 1 which is also crucial for other search problems, e.g., [101,102]. CM can be regarded as a complex function that maps the current state of database and estimated cardinalities to the cost of executing a (sub)plan.…”
Section: Introduction (mentioning)
confidence: 99%