2016 IEEE 32nd International Conference on Data Engineering (ICDE) 2016
DOI: 10.1109/icde.2016.7498267

Tolerating correlated failures in Massively Parallel Stream Processing Engines

Abstract: Fault-tolerance techniques for stream processing engines can be categorized into passive and active approaches. A typical passive approach periodically checkpoints a processing task's runtime state and can recover a failed task by restoring that state from its latest checkpoint. An active approach, on the other hand, usually employs backup nodes to run replicated tasks. Upon failure, the active replica can take over the processing of the failed task with minimal latency. However, both approach…

Citations: cited by 28 publications (9 citation statements)
References: 24 publications
“…Existing work on high availability in stream processing [29] proposes active replication [9,39], passive replication [27,33], hybrid active-passive replication [26,42], or models multiple approaches and evaluates them with simulated experiments [13,29]. These approaches either constrain operator logic or support weaker than exactly-once consistency guarantees.…”
Section: High Availability
Citation type: mentioning
confidence: 99%
“…Empirical studies of high availability in stream processing [76] propose an active replication approach [26,119], a passive replication approach [66,75,92], a hybrid active-passive replication approach [71,122,145], or model multiple approaches and evaluate them with simulated experiments [40,76].…”
Section: High Availability
Citation type: mentioning
confidence: 99%
“…Cardellini et al [15], [16] formulate an optimal DSP replication and placement model, where they compute a number of replica for each task to optimally scale the application. In [17] is presented a DSP engine that implements a checkpointing system combined with a partial replication of tasks, in order to reduce the cost of the system recovery and the necessity of backup nodes.…”
Section: A Reliability and Fault Tolerance
Citation type: mentioning
confidence: 99%