How I Learned to Stop Worrying and Love Re-optimization

Perron, Matthew; Shang, Zeyuan; Kraska, Tim

doi:10.1109/icde.2019.00191

Cited by 20 publications

(16 citation statements)

References 27 publications

“…In other words, reoptimization admits that the query optimizer can make mistakes in decisions and thus tries to alleviate them. In a recent work, Perron et al [31] evaluated the performance of reoptimization on the Join Order Benchmark [23], and their result suggests that reoptimization can significantly reduce query execution time.…”

Section: Reoptimization (mentioning)

confidence: 99%

“…Another baseline that we originally considered is reoptimization [18], which has been previously evaluated on JOB via simulation by Perron et al [31]. We implemented the reoptimization technique in PostgreSQL (we describe the details in Appendix E).…”

Section: Baseline (mentioning)

confidence: 99%

“…Another approach is to make a better query plan with the help of queries that have been executed before [9,13,20,24,30,33,34]. However, none of these approaches are completely satisfying [31].…”

Section: Introduction (mentioning)

confidence: 99%

“…As it is difficult to solve the cardinality estimation problem, reoptimization [3,18,31] is proposed to avoid the need for accurate cardinality estimation. Reoptimization first generates an initial execution plan at the beginning, and then detects during runtime whether the actual behavior of a query plan becomes significantly different from what was expected.…”

Section: Introduction (mentioning)

confidence: 99%

Break Up the Pipeline Structure to Reach a Nearly Optimal End-to-End Latency

Zhao, Gao. 2022. Preprint

Query optimization is still problematic in the commercial database system because database optimizers sometimes choose a bad execution plan with a several-fold difference in latency from the optimal one. In this paper, we design a new dynamic optimization strategy called query split, which takes advantage of runtime statistics. Integrating query split into PostgreSQL, we have a 2× speedup for total end-to-end latency on Join Order Benchmark and achieve near-optimal latency by comparing with the optimal execution plan. Our finding reveals that breaking up the static pipeline between database optimizer and executor can benefit the query processing and greatly reduce end-to-end latency.

“…First, the estimation accuracy does not directly equal to the query plan quality. As different sub-plan queries matters differently to the query plan [3,44,55], a more accurate method may produce a much worse query plan if they mistake a few very important estimations [40]. Second, the actual query time is affected by multiple factors, including both query plan quality and CardEst inference cost.…”

Section: Introduction (mentioning)

confidence: 99%

Cardinality Estimation in DBMS: A Comprehensive Benchmark Evaluation

Han, Wu, et al. 2021. Preprint

Cardinality estimation (CardEst) plays a significant role in generating high-quality query plans for a query optimizer in DBMS. In the last decade, an increasing number of advanced CardEst methods (especially ML-based) have been proposed with outstanding estimation accuracy and inference latency. However, there exists no study that systematically evaluates the quality of these methods and answer the fundamental problem: to what extent can these methods improve the performance of query optimizer in real-world settings, which is the ultimate goal of a CardEst method. In this paper, we comprehensively and systematically compare the effectiveness of CardEst methods in a real DBMS. We establish a new benchmark for CardEst, which contains a new complex real-world dataset STATS and a diverse query workload STATS-CEB. We integrate multiple most representative CardEst methods into an open-source database system PostgreSQL, and comprehensively evaluate their true effectiveness in improving query plan quality, and other important aspects affecting their applicability, ranging from inference latency, model size, and training time, to update efficiency and accuracy. We obtain a number of key findings for the CardEst methods, under different data and query settings. Furthermore, we find that the widely used estimation accuracy metric (Q-Error) cannot distinguish the importance of different sub-plan queries during query optimization and thus cannot truly reflect the query plan quality generated by CardEst methods. Therefore, we propose a new metric P-Error to evaluate the performance of CardEst methods, which overcomes the limitation of Q-Error and is able to reflect the overall end-to-end performance of CardEst methods. We have made all of the benchmark data and evaluation code publicly available at https://github.com/Nathaniel-Han/End-to-End-CardEst-Benchmark.