A communication-induced checkpointing protocol that ensures rollback-dependency trackability

Baldoni, Roberto; Hélary, Jean-Michel; Mostéfaoui, Achour; Raynal, Michel

doi:10.1109/ftcs.1997.614079

Cited by 75 publications

(83 citation statements)

References 12 publications


“…Communication-induced checkpointing avoids the domino-effect without requiring all checkpoints to be coordinated [12], [33], [55]. In these protocols, processes take two kinds of checkpoints, local and forced.…”

Section: Quasi-synchronous or Communication Induced Checkpointing (mentioning)

confidence: 99%

A Review of Fault Tolerant Checkpointing Protocols for Mobile Computing Systems

Garg, Kumar

2010

IJCA


A distributed system is a collection of independent entities that cooperate to solve a problem that cannot be solved individually. A mobile computing system is a distributed system in which some of the processes run on mobile hosts (MHs), whose location in the network changes with time. Mobile distributed systems raise new issues such as mobility, low bandwidth of wireless channels, disconnections, limited battery power, and lack of reliable stable storage on mobile nodes. This paper addresses the problem of fault-tolerant computing in mobile distributed systems. The techniques described are based on checkpointing and rollback recovery.



“…Besides these two fundamental approaches there is another approach known as communication induced check pointing approach (J. Tsai et al, 1998;R. Baldoni et al, 1997;J.…”

Section: Introduction (mentioning)

confidence: 99%

A Low-Overhead Non-Block Check Pointing and Recovery Approach for Mobile Computing Environment

Gupta, Liu, Koneru

2012

Advances and Applications in Mobile Computing


“…In the case of a fault, processes rollback to the last checkpointed state. Communication-induced Checkpointing: It avoids the domino-effect without requiring all checkpoints to be coordinated [2], [7], [9]. In these protocols, processes take two kinds of checkpoints, local and forced.…”

Section: Introduction 11 Definitions and Notations (mentioning)

confidence: 99%

“…To recover from a failure, the system restarts its execution from a previous consistent global state saved on the stable storage during fault-free execution. In distributed systems, checkpointing can be independent, coordinated [3], [8], [11] or quasi-synchronous [2], [9]. Message Logging is also used for fault tolerance in distributed systems [14].…”

Section: Introduction 11 Definitions and Notations (mentioning)

confidence: 99%

Anti-message Logging Based Coordinated Checkpointing Protocol for Deterministic Mobile Computing Systems

Kumar, Khunteta

2010

IJCA


A checkpointing algorithm for mobile computing systems must handle many new issues, such as mobility, low bandwidth of wireless channels, lack of stable storage on mobile nodes, disconnections, limited battery power, and a high failure rate of mobile nodes. These issues make traditional checkpointing techniques unsuitable for such environments. Minimum-process coordinated checkpointing is an attractive approach for introducing fault tolerance in mobile distributed systems transparently. This approach is domino-free, requires at most two checkpoints of a process on stable storage, and forces only a minimum number of processes to checkpoint. However, it requires extra synchronization messages, blocking of the underlying computation, or taking some useless checkpoints. In this paper, we propose a minimum-process coordinated checkpointing algorithm for deterministic mobile distributed systems, in which no useless checkpoints are taken, no processes are blocked, and anti-messages of very few messages are logged during checkpointing. We try to reduce the loss of checkpointing effort when a process fails to take its checkpoint in coordination with others. We also address related issues such as failures during checkpointing, disconnections, and concurrent initiations of the algorithm.

