Fourth International Conference on Autonomic Computing (ICAC'07) 2007
DOI: 10.1109/icac.2007.40

Towards Autonomic Fault Recovery in System-S

Abstract: System-S is a stream processing infrastructure which enables program fragments to be distributed and connected to form complex applications. There may be potentially tens of thousands of interdependent and heterogeneous program fragments running across thousands of nodes. While the scale and interconnection imply the need for automation to manage the program fragments, the need is intensified because the applications operate on live streaming data and thus need to be highly available. System-S has been designe…

Citations: cited by 25 publications (17 citation statements)
References: 13 publications
“…[17], [19] are in the context of System S [2], a stream processing system developed at IBM Research. [17] studies how to provide high availability for the system component of JMN by checkpointing related job state information. It does not study high availability for jobs, which have different requirements due to load spikes and tight coupling of subjobs across machines.…”
Section: Related Work (mentioning)
confidence: 99%
“…The primary control mechanism of the JMN is the Finite State Machine (FSM) engine [14]. The FSM engine itself does not define any automata.…”
Section: Job Orchestration Overview (mentioning)
confidence: 99%
“…Each System S site runs an instance of each of these system components, possibly as a distributed and fault-tolerant service [14]. Each site may belong to and be managed by a distinct organization; administrators who manage one site generally have no control over another site.…”
Section: System S (mentioning)
confidence: 99%