2017
DOI: 10.14778/3151106.3151112

Estimating join selectivities using bandwidth-optimized kernel density models

Abstract: Accurately predicting the cardinality of intermediate plan operations is an essential part of any modern relational query optimizer. The accuracy of said estimates has a strong and direct impact on the quality of the generated plans, and incorrect estimates can have a negative impact on query performance. One of the biggest challenges in this field is to predict the result size of join operations. Kernel Density Estimation (KDE) is a statistical method to estimate multivariate probability distributions from a …
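To make the KDE idea in the abstract concrete, the following is a minimal sketch of how a one-dimensional Gaussian KDE built on a data sample can estimate the selectivity of a range predicate. It is not the paper's method: the function names, the fixed `bandwidth` parameter, and the restriction to a single numeric attribute are assumptions made for illustration, and the bandwidth optimization and join handling that the title refers to are not shown.

```python
import math


def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_range_selectivity(sample, low, high, bandwidth):
    """Estimate the fraction of rows with low <= value <= high.

    Hypothetical helper: places one Gaussian kernel of width `bandwidth`
    on every sampled value and averages the probability mass that each
    kernel assigns to the interval [low, high].
    """
    if not sample:
        raise ValueError("sample must not be empty")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    total = 0.0
    for x in sample:
        upper = gaussian_cdf((high - x) / bandwidth)
        lower = gaussian_cdf((low - x) / bandwidth)
        total += upper - lower
    return total / len(sample)


if __name__ == "__main__":
    # Illustrative sample values; not taken from the paper.
    sample = [1.0, 2.0, 2.5, 3.0, 7.0, 8.0]
    print(kde_range_selectivity(sample, 2.0, 3.0, bandwidth=0.5))
```

Multiplying the returned fraction by the table's row count gives a cardinality estimate. A smaller bandwidth follows the sample more closely, while a larger one smooths it, which is why the choice of bandwidth matters for estimation quality.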

Cited by 58 publications (55 citation statements)
References 33 publications (62 reference statements)
“…To support the shifts in workload and dataset, they update the bandwidth after each incoming query and design the new sample maintenance method for insert-only workload and updates/deletions workload. Furthermore, in Kiefer et al [48] extend the method into estimating the selectivity of join. They design two different models: single model over the join samples and the models over the base tables, which does not need the join operation and estimates the selectivity of join with the independent assumption.…”
Section: Unsupervised Methods (citation type: mentioning)
confidence: 99%
“…Therefore, it would be a natural idea of combining data-driven and query-driven models. As discussed before, the existing proposals leveraging both data and query workload [19,30,37,39] are insufficient towards this direction. An idea to overcome the problem of data-driven methods suffering the tail of the distribution due to their averaging optimization target would be using ensemble methods with each component targeting a different part of the distribution.…”
Section: Overview (citation type: mentioning)
confidence: 99%
“…In fact, a few proposals (e.g., DeepDB) consider the combination as an interesting avenue for future work. Moreover, towards this direction several solutions [19,30,37,39] have been proposed to utilize both data and workload.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The problem of join selectivity estimation has been extensively studied in the relational database [1,7,15,16,18,33,34]. Particularly, these studies can be divided into three classes.…”
Section: Introduction (citation type: mentioning)
confidence: 99%