2018
DOI: 10.14778/3236187.3236195

RHEEM: enabling cross-platform data processing

Abstract: Solving business problems increasingly requires going beyond the limits of a single data processing platform (platform for short), such as Hadoop or a DBMS. As a result, organizations typically perform tedious and costly tasks to juggle their code and data across different platforms. Addressing this pain and achieving automatic cross-platform data processing is quite challenging: finding the most efficient platform for a given task requires quite good expertise for all the available platforms. We present Rheem…

Citations: cited by 41 publications (17 citation statements)
References: 37 publications
“…Contributions We delve into the cross-platform optimizer of Rheem [3,4,47], our open-source cross-platform system [55]. While we present the system design of Rheem in [4] and briefly discuss the data movement aspect in [43], in this paper, we describe in detail how our cost-based cross-platform optimizer tackles all of the above research challenges. 2 The idea is to split a single task into multiple atomic operators and to find the most suitable platform for each operator (or set of operators) so that its total cost is minimized.…”
Section: Current Practice
Type: mentioning
confidence: 99%
“…6). (4) We explain how we exploit our optimization pipeline for performing progressive optimization to deal with poor cardinality estimates (Sect. 7).…”
Section: Current Practice
Type: mentioning
confidence: 99%