2014
DOI: 10.14778/2732977.2732987

Storing and querying tree-structured records in Dremel

Abstract: In Dremel, data is stored as nested relations. The schema for a relation is a tree, all of whose nodes are attributes, and whose leaf attributes hold values. We explore filter and aggregate queries that are given in the Dremel dialect of SQL. Complications arise because of repeated attributes, i.e., attributes that are allowed to have more than one value. We focus on the common class of Dremel queries that are processed on column-stored data in a way that results in query processing time that is linear on the …

Citations: cited by 16 publications (5 citation statements)
References: 11 publications
“…Google's large scale analytics systems such as Spanner [8], F1 [48,46], and Dremel [40] support querying against complex objects. Dremel performs evaluation over a "semiflattened" [2] format in order to avoid the space inefficiencies caused by fully flattening data. Skew-resilience and query processing performance are not discussed in [2], which focuses on the impact on storage, while details of the query-processing techniques applied in the commercial systems are proprietary.…”
Section: Related Work (mentioning)
confidence: 99%
“…Dremel performs evaluation over a "semiflattened" [2] format in order to avoid the space inefficiencies caused by fully flattening data. Skew-resilience and query processing performance are not discussed in [2], which focuses on the impact on storage, while details of the query-processing techniques applied in the commercial systems are proprietary. Skew-resilience in parallel processing [10,37], and methods for efficient identification of heavy keys [45] have been investigated for relational data.…”
Section: Related Work (mentioning)
confidence: 99%
“…Suppose we have four data sources that offer information about movies and actors participating in them and demographic information about the actor, such as telephone numbers and addresses. In many modern applications (e.g., Dremel [14,15]) all this information is collected in a single relation with many attributes rather than in many relations (as would be the case, e.g., in a star schema with a big fact table and several smaller dimension tables). It might be the case that we have many data sources that offer similar information which, therefore, are over almost the same attributes but not quite.…”
Section: Introduction (mentioning)
confidence: 99%
“…In many modern applications (e.g. Dremel [48], [49]) all this information is collected in a single relation with many attributes rather than in many relations (as would be the case, e.g., in a star schema with a big fact table and several smaller dimension tables). It might be the case that we have many data sources that offer similar information which, therefore, are over almost the same attributes but not quite.…”
Section: Introduction (mentioning)
confidence: 99%
“…The reducer keys affected by this are (1,0), (2,0), (3,0), ..., (8,0) (in figures seen in parts 21, 30, 39, 48, 57, 66, 75); first of which is affected by skew. When reducers are decreased to 27 with hash function (x mod 3) attributes are also hashed to the same reducers, among these attributes are 0, 4, 7, 9 which contribute to some reducers affected by skew.…”
mentioning
confidence: 99%