2018
DOI: 10.1007/s00778-018-0514-9

A survey of state management in big data processing systems

Abstract: The concept of state and its applications vary widely across big data processing systems. This is evident in both the research literature and existing systems, such as Apache Flink, Apache Heron, Apache Samza, Apache Spark, and Apache Storm. Given the pivotal role that state management plays, particularly, for iterative batch and stream processing, in this survey, we present examples of state as an enabler, discuss the alternative approaches used to handle and implement state, capture the many facets of state …

Cited by 64 publications (36 citation statements)
References 96 publications (216 reference statements)
“…While scaling‐in/out stateless operators can be achieved by just turning off/on operator replicas, elasticity of stateful operators requires state migration and repartitioning among the replicas, because the system needs to preserve the consistency of the operations. State management in DPS systems is nicely surveyed by To et al; in the following, we focus on the issues that are more closely related to our work.…”
Section: Related Work (mentioning)
confidence: 99%
“…The main distinguishing feature of the existing elasticity policies regards their being centralized, fully decentralized, or hybrid. Another characterizing aspect is related to taking into account or not the reconfiguration costs that arise after a scaling-in/out decision as well as a migration from one computing resource to another and are particularly burdensome in case of stateful operators [31].…”
Section: Elasticity Policies (mentioning)
confidence: 99%
“…Deletes in LSM-trees. LSM-trees are employed as the storage layer for relational systems [28], streaming systems [2,41,65], and pure key-value storage [52,68]. As a result, an LSM delete operation may be triggered by various logical operations, not limited to user-driven deletes.…”
Section: Introduction (mentioning)
confidence: 99%