Proceedings of 1994 IEEE 10th International Conference on Data Engineering
DOI: 10.1109/icde.1994.283001

Performing group-by before join [query processing]

Abstract: Assume that we have an SQL query containing joins and a group-by. The standard way of evaluating this type of query is to first perform all the joins and then the group-by operation. However, it may be possible to perform the group-by early, that is, to push the group-by operation past one or more joins. Early grouping may reduce the query processing cost by reducing the amount of data participating in joins. We formally define the problem, adhering strictly to the semantics of NULL and duplicate eliminatio…
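
The rewrite described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the `customers` and `orders` tables, their columns, and the sample rows are hypothetical, and the example assumes the simple case where the join column `cust_id` is a key of `customers` and the aggregate reads only `orders` columns. Under those assumptions, grouping `orders` first and joining afterwards should return the same rows as joining first and grouping afterwards, while the join sees one row per customer instead of one row per order.

```python
import sqlite3

# Hypothetical schema and data, chosen only to illustrate the idea.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         cust_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- The last order has a NULL cust_id and matches no customer.
    INSERT INTO orders VALUES (10, 1, 5), (11, 1, 7), (12, 2, 3), (13, NULL, 9);
""")

# Standard evaluation order: join first, then group.
LATE_GROUPING = """
    SELECT c.cust_id, c.name, SUM(o.amount) AS total
    FROM orders AS o JOIN customers AS c ON o.cust_id = c.cust_id
    GROUP BY c.cust_id, c.name
    ORDER BY c.cust_id
"""

# Early grouping: aggregate orders per cust_id, then join the smaller result.
EARLY_GROUPING = """
    SELECT c.cust_id, c.name, g.total
    FROM (SELECT cust_id, SUM(amount) AS total
          FROM orders GROUP BY cust_id) AS g
    JOIN customers AS c ON g.cust_id = c.cust_id
    ORDER BY c.cust_id
"""

late_rows = conn.execute(LATE_GROUPING).fetchall()
early_rows = conn.execute(EARLY_GROUPING).fetchall()
print(late_rows)
print(early_rows)
```

In this sketch the NULL group produced by the early aggregation is discarded by the equality join, so both forms agree. The equivalence does not hold for arbitrary queries, which is why the abstract stresses the semantics of NULL and duplicates.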

citations
Cited by 33 publications
(45 citation statements)
references
References 5 publications
“…In the general case, special difficulties arise dealing with the relational group by (basis of roll-up, OLAP key operator). Interestingly, we want to remark the gain when dealing with our restricted group by (i.e., roll-up) instead of the generic one, whose difficulty is discussed in depth in [5] (where it is explicitly said that no law is stated) and specially in [19] (where the whole work is devoted to analyze all possibilities between join and group-by).…”
Section: Normalizing the Mac (mentioning)
confidence: 99%
“…Interestingly, it became immediately apparent that prior work on partially or totally pushing group by operations past one or more join operations (also called eager aggregation transformation) [17,3,18,7,4] could be applied to these plans to partially group and aggregate tuples that are selected from the fact table. This transformation is not possible in traditional star schemas where no information about the hierarchies is encoded in the fact table.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this algorithm, we do not materialize the join operation as in the traditional algorithms where the join operation is evaluated first and then the group-by and aggregate functions (Yan and Larson, 1994). So the Input/Output cost is minimal because we do not need to save the huge volume of data that results from the join operation.…”
Section: Introduction (mentioning)
confidence: 99%
“…But the response time of these queries is significantly reduced if the group-by operation is performed before the join (Chaudhuri and Shim, 1994), because group-by reduces the size of the relations thus minimizing the join and data redistribution costs. Several algorithms that perform the group-by operation before the join operation were presented in the literature (Shatdal and Naughton, 1995;Taniar et al, 2000;Taniar and Rahayu, 2001;Yan and Larson, 1994). In the "Early Distribution Schema" algorithm presented in (Taniar and Rahayu, 2001), all the tuples of the tables are redistributed before applying the join or the group-by operations, thus the communication cost in this algorithm is very high.…”
Section: Introduction (mentioning)
confidence: 99%