2013
DOI: 10.1587/transinf.e96.d.886

A Scalable Communication-Induced Checkpointing Algorithm for Distributed Systems

Abstract: Communication-induced checkpointing (CIC) has two main advantages: first, it allows processes in a distributed computation to take asynchronous checkpoints, and second, it avoids the domino effect. To achieve these, CIC algorithms piggyback information on the application messages and take forced local checkpoints when they recognize potentially dangerous patterns. The main disadvantages of CIC algorithms are the amount of overhead per message and the induced storage overhead. In this paper we present …
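
The mechanism described in the abstract (piggybacking control information on application messages and taking a forced checkpoint before a risky delivery) can be illustrated with a minimal sketch. This is not the algorithm of the paper: it assumes the much simpler classic index-based scheme, often called BCS, in which the only piggybacked value is a single checkpoint index. The names `Message`, `Process`, `basic_checkpoint`, `send`, and `receive` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    payload: str
    index: int  # sender's checkpoint index, piggybacked on the message


@dataclass
class Process:
    name: str
    index: int = 0  # index of this process's latest checkpoint
    checkpoints: list = field(default_factory=list)

    def _save(self, kind: str) -> None:
        # Stand-in for writing local state to stable storage.
        self.checkpoints.append((self.index, kind))

    def basic_checkpoint(self) -> None:
        # Asynchronous checkpoint taken at the process's own initiative.
        self.index += 1
        self._save("basic")

    def send(self, payload: str) -> Message:
        # Piggyback the current checkpoint index on the application message.
        return Message(payload, self.index)

    def receive(self, msg: Message) -> str:
        # A message from a "later" checkpoint interval is the dangerous
        # pattern in this simplified scheme: checkpoint before delivering.
        if msg.index > self.index:
            self.index = msg.index
            self._save("forced")
        return msg.payload


p, q = Process("p"), Process("q")
p.basic_checkpoint()           # p moves to index 1
q.receive(p.send("hello"))     # q is forced to checkpoint before delivery
q.receive(p.send("again"))     # same index, no further forced checkpoint
```

In this sketch the per-message overhead is one integer. Schemes that piggyback more information to avoid unnecessary forced checkpoints incur the larger message overhead that the abstract names as a disadvantage.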

Citations: cited by 4 publications (1 citation statement)
References: 10 publications (19 reference statements)
“…An optimized version of FINE, called LazyFINE, applies a lazy strategy using the work of Lou and Manivannan [21,22]. Finally, Simon et al [9,12,23] propose another FI variant, which addresses system scalability, aimed for large-scale systems. Simon et al reduce the number of forced checkpoints by delaying non-forced checkpoints.…”
Section: Related Work (mentioning)
confidence: 99%