2018
DOI: 10.14778/3229863.3229871

F1 Query

Abstract: F1 Query is a stand-alone, federated query processing platform that executes SQL queries against data stored in different file-based formats as well as different storage systems at Google (e.g., Bigtable, Spanner, Google Spreadsheets, etc.). F1 Query eliminates the need to maintain the traditional distinction between different types of data processing workloads by simultaneously supporting: (i) OLTP-style point queries that affect only a few records; (ii) low-latency OLAP querying of large amounts of data; and …

Citations: cited by 33 publications (9 citation statements)
References: 50 publications
“…With input and output sizes many times larger than the available memory, the cost of the necessary join lookup and fetch for the materialization depends on the cost of random I/Os [6,12]. Local NVM and SSD storage could provide efficient random reads; in our environment, however, storage is disaggregated and handled by servers separate from the ones executing the query logic [29]. The cost of an I/O is a network round trip, plus the invocation of the storage service, plus an I/O in a shared and busy disk drive.…”
Section: Top-k Execution Strategies (mentioning)
confidence: 99%
“…Externally sorting the entire input is an expensive operation and results in unpleasant user experience as the execution of a top-k query exhibits a performance cliff; namely the sudden and drastic change in the execution cost when the output exceeds the memory capacity. An analysis of our production query logs showed that, on an average day, F1 Query [29] executes tens of thousands of top-k queries that resort to an external sort of the entire input. We observe that it is very common for top-k queries to use secondary storage, due to high contention for main memory resources or simply because of large requested outputs.…”
Section: Introduction (mentioning)
confidence: 99%
“…A popular way to avoid expensive remote join operations, already used in early parallel systems, is to co-partition tables on their join key [22,26]. Generalizations of the latter technique where co-partitioning is determined by more complex join predicates have been shown to be effective in modern systems as well [38,39,41,45].…”
Section: Co-partitioning (mentioning)
confidence: 99%
“…Recent database systems like Google's F1 [39,41] use hierarchical partitioning schemes to provide performance while ensuring consistency under updates. Hierarchical partitioning is a variant of the co-partitioning approach [42], introduced as predicate-based reference partitioning [45].…”
Section: Hierarchical Partitioning Schemes (mentioning)
confidence: 99%