Proceedings of the 2001 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems 2001
DOI: 10.1145/378420.378434

Analysis and implementation of software rejuvenation in cluster systems

Abstract: Several recent studies have reported the phenomenon of "software aging", one in which the state of a software system degrades with time. This may eventually lead to performance degradation of the software or crash/hang failure or both. "Software rejuvenation" is a pro-active technique aimed to prevent unexpected or unplanned outages due to aging. The basic idea is to stop the running software, clean its internal state and restart it. In this paper, we discuss software rejuvenation as applied to cluster systems…

Citations: cited by 102 publications (7 citation statements)
References: 24 publications
“…Both these studies [19,8] found that nodes which just failed are more likely to fail again in the near future. At the same time, it has also been found [18] that software related error conditions can accumulate over time, leading to system failing in the long run.…”
Section: Related Work (mentioning)
confidence: 99%
“…The introduced watchdog mechanism for fast software system recovery relates to the concept of software rejuvenation. Several works have investigated the phenomenon of "software aging" wherein the health of a software system degrades with time [38], [39]. These papers conclude that a mechanism which "rejuvenates" or "recovers" the software component to its stable state, would provide long-term benefits in terms of experienced system availability.…”
Section: Related Work (mentioning)
confidence: 99%
“…The original work [14] looked at telecommunication switches and long-running scientific programs. Another study examined rejuvenation in the context of clusters [22] and found time-based rejuvenation policies effective. However, their models assumed an increasing hazard function.…”
Section: Related Work (mentioning)
confidence: 99%