ACM/IEEE SC 2005 Conference (SC'05)
DOI: 10.1109/sc.2005.76

Transparent, Incremental Checkpointing at Kernel Level: a Foundation for Fault Tolerance for Parallel Computers

Abstract: We describe the software architecture, technical features, and performance of TICK (Transparent Incremental Checkpointer at Kernel level), a system-level checkpointer implemented as a kernel thread, specifically designed to provide fault tolerance in Linux clusters. This implementation, based on the 2.6.11 Linux kernel, provides the essential functionality for transparent, highly responsive, and efficient fault tolerance based on full or incremental checkpointing at system level. TICK is completely user-transp…

Cited by 117 publications (78 citation statements)
References 23 publications
“…Only these pages are included in the checkpoint [25] [26]. The disadvantage of this approach is that it requires kernel level support.…”
Section: Related Work (mentioning)
confidence: 99%
“…Researchers propose increment checkpoint techniques to reduce checkpoint overhead [2]- [9]. At present, there are two main techniques for incremental checkpoint.…”
Section: Checkpoint-restart Techniques (mentioning)
confidence: 99%
“…These benchmarks are widely used to test the checkpoint system performance [2], [6]. In the whole experiment, checkpoints are triggered by a timer interrupt at regular intervals.…”
Section: Experiments Setup (mentioning)
confidence: 99%
“…Two fundamentally different approaches may be employed, namely page protection mechanisms or page-table dirty bits. Different implementation variants build on these schemes, such as the bookkeeping and saving scheme that, based on the dirty bit scheme, copies pages into a buffer [19].…”
Section: Memory Management (mentioning)
confidence: 99%
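
The dirty-bit scheme mentioned in the statement above can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not TICK's kernel implementation: the dirty bits are kept in software instead of being read from hardware page tables, and every name in it (`PagedMemory`, `incremental_checkpoint`, `restore`, the 4096-byte `PAGE_SIZE`) is hypothetical and chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical)


class PagedMemory:
    """Toy address space that keeps one software dirty bit per page."""

    def __init__(self, num_pages):
        self.data = bytearray(num_pages * PAGE_SIZE)
        self.dirty = [False] * num_pages

    def write(self, addr, payload):
        """Store payload at addr and mark every page it touches as dirty."""
        if addr < 0 or addr + len(payload) > len(self.data):
            raise ValueError("write outside the simulated address space")
        if not payload:
            return
        self.data[addr:addr + len(payload)] = payload
        first = addr // PAGE_SIZE
        last = (addr + len(payload) - 1) // PAGE_SIZE
        for page in range(first, last + 1):
            self.dirty[page] = True

    def incremental_checkpoint(self):
        """Copy only the dirty pages into a buffer, then clear their bits."""
        saved = {}
        for page, is_dirty in enumerate(self.dirty):
            if is_dirty:
                start = page * PAGE_SIZE
                saved[page] = bytes(self.data[start:start + PAGE_SIZE])
                self.dirty[page] = False
        return saved


def restore(num_pages, checkpoints):
    """Rebuild memory by replaying incremental checkpoints, oldest first."""
    memory = bytearray(num_pages * PAGE_SIZE)
    for checkpoint in checkpoints:
        for page, content in checkpoint.items():
            start = page * PAGE_SIZE
            memory[start:start + PAGE_SIZE] = content
    return memory


if __name__ == "__main__":
    mem = PagedMemory(num_pages=4)
    mem.write(0, b"abc")                    # touches page 0 only
    first = mem.incremental_checkpoint()
    mem.write(2 * PAGE_SIZE - 1, b"xy")     # straddles pages 1 and 2
    second = mem.incremental_checkpoint()
    print(sorted(first), sorted(second))
    print(restore(4, [first, second]) == mem.data)
```

The toy memory starts zero-filled, so replaying the increments onto a zeroed buffer is enough to rebuild it; a real checkpointer would start the chain from a full checkpoint. A page-protection variant would record the same per-page information by write-protecting pages and catching the resulting faults instead of setting the bit on each write.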