1997
DOI: 10.1006/jpdc.1997.1338

Application Level Fault Tolerance in Heterogeneous Networks of Workstations

citations
Cited by 88 publications
(30 citation statements)
references
References 17 publications
“…Semi-transparent approaches [8,42] provide APIs to specify when and what to checkpoint. Transparent approaches [8,28] use preprocessors to produce checkpoint-capable code. SGuard is more transparent than semi-transparent approaches and more fine-grained than static or runtime analysis ones.…”
Section: Related Work (mentioning)
confidence: 99%
“…SGuard is more transparent than semi-transparent approaches and more fine-grained than static or runtime analysis ones. Additionally, unlike SGuard, some of the above techniques do not provide concurrent checkpointing [8] and many require expensive state serialization [8]. The C³ [10] application-level checkpointing system is most similar to SGuard in that it also uses a custom memory manager.…”
Section: Related Work (mentioning)
confidence: 99%
“…Well-known examples are the Isis [6] and Horus [54] systems at Cornell University, the Totem system at the University of California, Santa Barbara [1], [40], and the Transis system at the Hebrew University of Jerusalem [19], [20]. Projects focusing on fault tolerance through process replication and rollback-recovery include the Manetho project at Rice University [25] and the DOME project at Carnegie Mellon University [4], [18]. RAID techniques are widely used for performance and reliability in storage systems [17].…”
Section: Related Work (mentioning)
confidence: 99%
“…In particular, checkpointing strategies may take advantage of the replication and redundancy inherent in shared memory systems to achieve better performance. This has been explored by researchers for transparent runtime libraries that implement distributed shared memory [21][22][23][24], and for programs that make explicit use of data structures with shared memory semantics [25,26].…”
Section: Distributed Shared Memory and Other Programming Paradigms (mentioning)
confidence: 99%