2017
DOI: 10.1002/cpe.4334

Optimal operator deployment and replication for elastic distributed data stream processing

Abstract: Processing data in a timely manner, data stream processing (DSP) applications are receiving increasing interest for building new pervasive services. Due to the unpredictability of data sources, these applications often operate in dynamic environments; therefore, they require the ability to elastically scale in response to workload variations. In this paper, we deal with a key problem for the effective runtime management of a DSP application in geo‐distributed environments: we investigate the placeme…

Citations: cited by 47 publications (68 citation statements)
References: 43 publications
“…Bugra et al [1] proposed a system of auto-parallelization that dynamically adjusts the number of parallel channels to achieve the best performance based on changes in the workload. Marangozova-Martin et al [2] proposed multi-level elasticity in stream processing environments with low latency and minimum resources, and Cardellini et al [3] dealt with effective runtime management in terms of placement and replication decisions while considering the application and resource heterogeneity and the migration overhead, so to select the optimal adaptation strategy that can minimize migration costs while satisfying the application QoS requirements. These papers' objective was to achieve the elastic scalability for stream processing systems based on the individual machines or nodes.…”
Section: Related Work (mentioning)
confidence: 99%
“…Most of the existing system architectures consider a centralized management solution, where a single coordination entity exploits its global knowledge about the entire system state to plan the proper adaptation actions (e.g., [6,7,17-21]). Although this approach can potentially achieve a global optimum adaptation strategy, it may not be suitable for a wide-area distributed environment, because of the tight coupling among the system components and the fact that a central manager represents a bottleneck in a large-scale system due to monitoring and planning overheads.…”
Section: System Architectures (mentioning)
confidence: 99%
“…Other works (e.g., [7,35-38]) use more complex centralized policies to determine the scaling decisions, exploiting optimization methods that rely on the knowledge of a global model, such as integer linear programming [7], control theory [35], queueing theory [36], and fuzzy logic [37]. In [7], we presented an integer linear programming problem for the run-time elasticity management of DSP applications that takes into account the application reconfiguration costs after scaling operations and aims to minimize them while satisfying the application performance requirements. Lohrmann et al [36] proposed a strategy that enforces latency constraints by relying on a predictive latency model based on queueing theory.…”
Section: Elasticity Policies (mentioning)
confidence: 99%
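
The idea in the statement above, choosing a scaling plan that minimizes reconfiguration cost while still meeting a performance requirement, can be illustrated with a toy example. The sketch below is not the integer linear programming formulation from the cited paper: it swaps in an exhaustive search over small replica counts, and every name and number in it (OperatorLoad, choose_replication, the rates and costs) is a hypothetical assumption made for illustration.

```python
from itertools import product
from typing import Dict, NamedTuple, Optional, Tuple


class OperatorLoad(NamedTuple):
    # All fields are illustrative assumptions, not quantities from the paper.
    input_rate: float        # tuples/s arriving at the operator
    rate_per_replica: float  # tuples/s a single replica can process
    current_replicas: int    # replicas deployed before the adaptation
    cost_per_change: float   # cost of adding or removing one replica


def choose_replication(
    operators: Dict[str, OperatorLoad], max_replicas: int = 4
) -> Tuple[Optional[Dict[str, int]], Optional[float]]:
    """Return the feasible replica plan with the lowest reconfiguration cost.

    Exhaustive search stands in for an ILP solver; it is only practical
    for a handful of operators.
    """
    names = list(operators)
    best_plan, best_cost = None, None
    for counts in product(range(1, max_replicas + 1), repeat=len(names)):
        plan = dict(zip(names, counts))
        # Performance requirement: each operator must keep up with its input.
        if any(plan[n] * operators[n].rate_per_replica < operators[n].input_rate
               for n in names):
            continue
        # Reconfiguration cost: pay for every replica added or removed.
        cost = sum(abs(plan[n] - operators[n].current_replicas)
                   * operators[n].cost_per_change for n in names)
        if best_cost is None or cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan, best_cost


example = {
    "filter": OperatorLoad(250.0, 100.0, 1, 2.0),
    "aggregate": OperatorLoad(80.0, 100.0, 3, 1.0),
}
plan, cost = choose_replication(example)
print(plan, cost)
```

In this toy instance the overloaded "filter" operator is scaled out just enough to keep up with its input, while "aggregate" is left untouched because scaling it in would add cost without being required.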
“…A DSP application is commonly structured as a directed graph whose vertices are data sources, operators, and data sinks, whereas edges represent the data streams between operators. The application has one or multiple data sources that produce an input data stream, operators that perform transformations over the streaming data (e.g., filtering, aggregation, convolution) until the data reaches a data sink [3]. DSP applications are traditionally deployed on the Cloud in order to explore its virtually unlimited number of resources.…”
Section: Introduction (mentioning)
confidence: 99%
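
The graph structure described in the statement above (sources, operators, and sinks as vertices, data streams as edges) can be sketched as a small data structure. This is a minimal illustration only; the class name DspGraph, its methods, and the example vertex names are hypothetical and do not come from the cited paper or from any DSP framework.

```python
from collections import deque
from typing import Dict, List

VERTEX_KINDS = ("source", "operator", "sink")


class DspGraph:
    """Directed graph of a DSP application (illustrative sketch)."""

    def __init__(self) -> None:
        self.kind: Dict[str, str] = {}               # vertex -> kind
        self.downstream: Dict[str, List[str]] = {}   # vertex -> successors

    def add_vertex(self, name: str, kind: str) -> None:
        if kind not in VERTEX_KINDS:
            raise ValueError(f"unknown vertex kind: {kind}")
        self.kind[name] = kind
        self.downstream.setdefault(name, [])

    def add_stream(self, src: str, dst: str) -> None:
        # An edge is a data stream flowing from src to dst.
        if src not in self.kind or dst not in self.kind:
            raise KeyError("both endpoints must be added first")
        self.downstream[src].append(dst)

    def topological_order(self) -> List[str]:
        """Order vertices so data flows left to right (Kahn's algorithm)."""
        indegree = {v: 0 for v in self.kind}
        for successors in self.downstream.values():
            for dst in successors:
                indegree[dst] += 1
        queue = deque(v for v in self.kind if indegree[v] == 0)
        order: List[str] = []
        while queue:
            v = queue.popleft()
            order.append(v)
            for dst in self.downstream[v]:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    queue.append(dst)
        if len(order) != len(self.kind):
            raise ValueError("graph contains a cycle")
        return order


graph = DspGraph()
graph.add_vertex("sensor", "source")
graph.add_vertex("filter", "operator")
graph.add_vertex("aggregate", "operator")
graph.add_vertex("dashboard", "sink")
graph.add_stream("sensor", "filter")
graph.add_stream("filter", "aggregate")
graph.add_stream("aggregate", "dashboard")
print(graph.topological_order())
```

The topological order gives one valid sequence in which tuples traverse the application from the data source to the data sink.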