2014
DOI: 10.14778/2732279.2732281

Scalable and adaptive online joins

Abstract: Scalable join processing in a parallel shared-nothing environment requires a partitioning policy that evenly distributes the processing load while minimizing the size of state maintained and number of messages communicated. Previous research proposes static partitioning schemes that require statistics beforehand. In an online or streaming environment in which no statistics about the workload are known, traditional static approaches perform poorly. This paper presents a novel parallel online dataflow join operat…

Citations: cited by 70 publications (45 citation statements)
References: 40 publications
“…In contrast, stream processing requires to be real-time, a challenge that has drawn increasing attention from researchers [2,14,21,11]. Nevertheless, stream data cleaning approaches are still in their infancy.…”
Section: Related Work (mentioning)
confidence: 99%
“…We refer to a set of cells (that is, the corresponding input tuples) assigned to a single machine for local processing as a region. We adhere to rectangular regions, as opposed to rectilinear or non-contiguous regions, to incur minimal storage and communication costs [9].…”
Section: Background and Preliminaries (mentioning)
confidence: 99%
“…The content-insensitive partitioning scheme, CI (called 1-Bucket in [4], [9]), illustrated in Figure 1b, assigns all cells (n² of them) to machines, regardless of the join condition. Thus, regions cover the entire join matrix.…”
Section: A Content-insensitive Partitioning Scheme (mentioning)
confidence: 99%
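
The content-insensitive scheme quoted above can be illustrated with a minimal sketch. It assumes the machines form a grid of rectangular regions over the join matrix and that each tuple picks its row or column uniformly at random. The function names, the `side` labels "R" and "S", and the grid layout are hypothetical and are not taken from the cited papers.

```python
import random

def route_tuple(side, grid_rows, grid_cols, rng):
    """Return the (row, col) machine ids that receive one input tuple.

    Assumption: machines form a grid_rows x grid_cols grid, one rectangular
    region of the join matrix per machine. Routing ignores tuple content.
    """
    if side == "R":
        # An R tuple picks one random row and is replicated across that row.
        row = rng.randrange(grid_rows)
        return [(row, col) for col in range(grid_cols)]
    if side == "S":
        # An S tuple picks one random column and is replicated down that column.
        col = rng.randrange(grid_cols)
        return [(row, col) for row in range(grid_rows)]
    raise ValueError("side must be 'R' or 'S'")

def meeting_machines(r_targets, s_targets):
    """Machines that hold both tuples, where the join predicate is evaluated."""
    return set(r_targets) & set(s_targets)
```

Because a row and a column of the grid intersect in exactly one cell, every (R, S) pair is compared on exactly one machine whatever the join condition is, which is why the regions cover the entire join matrix.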
“…To overcome this problem, a plethora of Adaptive Query Processing (AQP) techniques have been recently proposed in the literature aiming to adapt the runtime query plan in respond to changes in the execution environment or the characteristics of the streaming data [4][5][6][7][8]. The rationale followed by these AQP techniques can be condensed into a three-phase procedure, called adaptivity loop [9].…”
Section: Main Text (mentioning)
confidence: 99%
“…First, the Adjust feedback function is called after a change is detected, instead of Initialize (line 24). Second, if the query plan is reoptimized, the monitoring phase calls the Initialize function of the change detection algorithm, so as the change detection algorithm to entirely forget its runtime state (lines 5-8). Recall that after the Initialize state, a "feedback-full" algorithm must collect feedback prior to be operational.…”
Section: The Novel Monitoring Phase (mentioning)
confidence: 99%