In this paper, we present a new algorithm for estimating the size of an equality join of multiple database tables. The proposed algorithm, Correlated Sampling, constructs a small-space synopsis for each table, which can then be used to quickly estimate the join size of this table with other tables, subject to dynamically specified predicate filter conditions, possibly specified over multiple columns (attributes) of each table. This algorithm makes a single pass over the data and is thus suitable for streaming scenarios. We compare this algorithm analytically to two previously known sampling approaches (independent Bernoulli Sampling and End-Biased Sampling) and to a novel sketch-based approach. We also compare these four algorithms experimentally and show that the results fully correspond to our analytical predictions based on derived expressions for the estimator variances, with Correlated Sampling giving the best estimates in a large range of situations.
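The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of hash-correlated sampling for a two-table join, not the paper's exact procedure. It assumes a single shared sampling rate `p` for both tables and a SHA-256-based hash; the names `unit_hash`, `build_synopsis`, and `estimate_join_size`, and the sample tables, are hypothetical. A row is kept when the hash of its join key falls below `p`, so both tables retain the same key values, and the join size of the two synopses is scaled by `1 / p`.

```python
import hashlib


def unit_hash(value):
    """Map a join-key value to a deterministic pseudo-random point in [0, 1)."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def build_synopsis(rows, key, p):
    """Single pass over the table: keep a row iff hash(join key) < p.

    Because the decision depends only on the key, every table sampled with
    the same p keeps the same set of key values (the "correlation").
    """
    return [row for row in rows if unit_hash(row[key]) < p]


def estimate_join_size(syn_a, syn_b, key, p,
                       pred_a=lambda row: True, pred_b=lambda row: True):
    """Estimate |A join B| under filters chosen at query time.

    The filters are applied to the synopses, the filtered synopses are
    joined on the key, and the count is scaled up by 1 / p.
    """
    counts_b = {}
    for row in syn_b:
        if pred_b(row):
            counts_b[row[key]] = counts_b.get(row[key], 0) + 1
    matched = sum(counts_b.get(row[key], 0) for row in syn_a if pred_a(row))
    return matched / p


# Hypothetical data: 20 orders spread over 5 customers.
orders = [{"cust": i % 5, "amount": i} for i in range(20)]
customers = [{"cust": i, "region": "EU" if i % 2 else "US"} for i in range(5)]

# With p = 1.0 every row is kept, so the estimate equals the exact join size.
p = 1.0
syn_orders = build_synopsis(orders, "cust", p)
syn_customers = build_synopsis(customers, "cust", p)
exact_all = estimate_join_size(syn_orders, syn_customers, "cust", p)
exact_filtered = estimate_join_size(
    syn_orders, syn_customers, "cust", p,
    pred_a=lambda row: row["amount"] >= 10,
)
```

In practice `p` would be well below 1 so that each synopsis is small. Under the assumption that the hash behaves like a uniform random function, each key value survives in both synopses with probability `p`, which is why dividing by `p` gives an unbiased estimate in this simplified setting.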
This paper describes enhanced subquery optimizations in the Oracle relational database system. It discusses several techniques: subquery coalescing, subquery removal using window functions, and view elimination for group-by queries. These techniques recognize and remove redundancies in query structures and convert queries into potentially more efficient forms. The paper also discusses novel parallel execution techniques, which have general applicability and are used to improve the scalability of queries that have undergone some of these transformations. It describes a new variant of antijoin for optimizing subqueries that involve the universal quantifier over columns that may contain nulls. It then presents performance results for these optimizations, which show significant execution time improvements.
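To illustrate what subquery removal using window functions can look like, here is a small sketch that runs a correlated aggregate subquery and an equivalent window-function form side by side. It is not Oracle's implementation or its optimizer output: it uses SQLite through Python's `sqlite3` module (window functions need SQLite 3.25 or later, which is assumed here), and the `emp` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER);
INSERT INTO emp VALUES
  ('a', 'x', 10), ('b', 'x', 30),
  ('c', 'y', 20), ('d', 'y', 20), ('e', 'y', 50);
""")

# Original form: the subquery rescans emp to get each row's department average.
with_subquery = """
SELECT e.name
FROM emp e
WHERE e.salary > (SELECT AVG(e2.salary) FROM emp e2 WHERE e2.dept = e.dept)
ORDER BY e.name
"""

# Rewritten form: one scan of emp; the per-department average is computed
# by a window function, and the subquery disappears.
with_window = """
SELECT name
FROM (
  SELECT name, salary, AVG(salary) OVER (PARTITION BY dept) AS dept_avg
  FROM emp
)
WHERE salary > dept_avg
ORDER BY name
"""

rows_subquery = conn.execute(with_subquery).fetchall()
rows_window = conn.execute(with_window).fetchall()
```

Both queries return the employees paid above their department average. The rewritten form reads the table once, which is the kind of redundancy removal the abstract refers to.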