Proceedings of the Fourth IEEE International Symposium on High Performance Distributed Computing
DOI: 10.1109/hpdc.1995.518702

CALYPSO: a novel software system for fault-tolerant parallel processing on distributed platforms

Cited by 44 publications (28 citation statements)
References 14 publications
“…To clarify the description of the new protocol, we distinguish between active and dormant TMUs, i.e. TMUs that handle active objects or dormant objects respectively 5 . We now describe the protocol for both kind of TMUs.…”
Section: Methods (mentioning)
confidence: 99%
“…fault-tolerant networks and system reconfiguration after a fault. There has been some though, for example, FT-Linda [4], PLinda [15], Orca [16], Calypso [5], and Fail-safe PVM [17]. These systems use a combination of well known mechanisms such as replication, transactions, message logging, or checkpoints and rollbacks to provide fault-tolerance.…”
Section: Related Work (mentioning)
confidence: 99%
“…In such DAGs all tasks at layer ℓ must be completed before any task at layer ℓ + 1 begins. Most previous bounds for firing-squad and eager scheduling apply only to these DAGs (see e.g., [27,4,3,2,5,25,30,28,26,7,6,8,31]. By including only critical tasks in the enabled pool, Level effectively transforms an arbitrary DAG into a synchronization-barrier DAG.…”
Section: Preliminaries: Firing-squad Scheduling With Synchronization (mentioning)
confidence: 99%
“…Much previous work in asynchronous parallel computing considers firing-squad and other eager-scheduling algorithms (see e.g., [2,3,4,5,7,6,8,25,26,27,28,30,31]). This prior work focuses on executing programs with full synchronization barriers, frequently PRAM programs.…”
Section: Firing-squad Scheduling (mentioning)
confidence: 99%
“…MILAN takes advantage of two execution techniques with strong theoretical foundations [5]-two-phase idempotent execution strategy, and eager scheduling-to provide programmers with the view of a fault-free virtual shared memory environment, even when the underlying resources may incur faults and exhibit wide variations in processing speeds. This support is exposed to the programmer in the form of several programming systems: Calypso [1] described in further detail below, Chime [15] which supports distributed execution of CC++ [3] programs, and Charlotte [2] which provides a web-based metacomputing infrastructure. In addition, the MILAN system consists of supporting infrastructure such as ResourceBroker (a system for dynamically managing the association and integration of resources into multiple parallel computations according to user-specified policies) and Knitting Factory (a toolkit for construction of distributed applications in an unpredictable metacomputing environment).…”
Section: The Milan System (mentioning)
confidence: 99%