1975
DOI: 10.1145/390016.808467

System structure for software fault tolerance

Abstract: The paper presents, and discusses the rationale behind, a method for structuring complex computing systems by the use of what we term “recovery blocks”, “conversations” and “fault-tolerant interfaces”. The aim is to facilitate the provision of dependable error detection and recovery facilities which can cope with errors caused by residual design inadequacies, particularly in the system software, rather than merely the occasional malfunctioning of hardware components.
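
The abstract names recovery blocks without spelling out the mechanism. As the scheme is commonly described, a recovery block runs a primary routine, checks its result with an acceptance test, and, if the test fails, restores the saved state and tries an alternate routine. The following is a minimal sketch under that reading, not the paper's own notation. All names here (`recovery_block`, `make_acceptance_test`, `buggy_sort`, `simple_sort`) are hypothetical and chosen only for illustration.

```python
import copy


class RecoveryBlockError(Exception):
    """Raised when every alternate fails the acceptance test."""


def recovery_block(state, acceptance_test, alternates):
    """Try each alternate in order on a fresh copy of `state`.

    Copying the state before each attempt plays the role of the
    recovery point: a failed alternate cannot corrupt what the next
    alternate sees.
    """
    for alternate in alternates:
        checkpoint = copy.deepcopy(state)  # recovery point (assumed to be a deep copy)
        try:
            candidate = alternate(checkpoint)
        except Exception:
            continue  # an exception counts as a failed alternate
        if acceptance_test(candidate):
            return candidate
    raise RecoveryBlockError("all alternates failed the acceptance test")


def make_acceptance_test(original):
    """Build a cheap plausibility check for a sort result."""
    def test(result):
        ordered = all(a <= b for a, b in zip(result, result[1:]))
        same_size = len(result) == len(original)
        same_sum = sum(result) == sum(original)
        return ordered and same_size and same_sum
    return test


def buggy_sort(xs):
    # Deliberate design fault for illustration: drops the largest element.
    return sorted(xs)[:-1]


def simple_sort(xs):
    return sorted(xs)


data = [3, 1, 2]
result = recovery_block(data, make_acceptance_test(data), [buggy_sort, simple_sort])
print(result)
```

In this sketch the acceptance test is deliberately cheaper and weaker than a full correctness check; it only has to be good enough to reject the faulty primary so that the alternate gets a chance to run.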

Cited by 745 publications (269 citation statements)
References 3 publications
“…When saving the state of the communication links, the Domino Effect [18] has to be avoided. Because of the high dynamicity of agent systems, independent checkpointing techniques are beneficial against coordinated checkpointing algorithms.…”
Section: Discussion Of Fault Tolerance Methods (mentioning)
confidence: 99%
“…Error isolation and containment by using virtual memory protection has also been studied for device drivers [4]. Multi-version techniques include recovery blocks [47] and N-version software [48].…”
Section: Related Work (mentioning)
confidence: 99%
“…The results produced by the processors involved in the execution of the same task are collected by the Error Management (EM) component, which selects the result to be delivered by applying an adjudication function, and either forwards it to the users or stores it in a stable storage to be used in subsequent computations. When dynamic error processing mechanisms are employed [2,10], redundant execution of an applicative task might be performed in phases, where the execution of further copies of the application is conditional on the absence of an adjudged result in the current phase, as notified by the EM; this implies information exchange between EM and the Planner. EM provides also information to another component, the Diagnosis Mechanism (DM): for each redundant task execution EM delivers to DM a notification about the processor(s) that originated disagreeing results with respect to the adjudicated output.…”
Section: System Model (mentioning)
confidence: 99%
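
The last statement above describes redundant execution in phases: further copies of a task run only if the current phase yields no adjudged result, and processors whose results disagree with the adjudicated output are reported for diagnosis. A rough sketch of that control flow follows. It assumes a simple quorum vote as the adjudication function, which the quoted text does not specify, and the names `adjudicate` and `run_in_phases` are hypothetical.

```python
from collections import Counter


def adjudicate(results, quorum):
    """Return a value reported by at least `quorum` replicas, else None.

    `results` maps a processor name to its (hashable) result.
    A quorum vote is an assumption; other adjudication functions are possible.
    """
    if not results:
        return None
    value, count = Counter(results.values()).most_common(1)[0]
    return value if count >= quorum else None


def run_in_phases(task, replicas, phase_sizes, quorum=2):
    """Run replicas phase by phase until a result is adjudged.

    `replicas` is a list of (name, callable) pairs. Returns the adjudged
    value (or None) and the names of processors that disagreed with it.
    """
    results = {}
    pending = list(replicas)
    for size in phase_sizes:
        batch, pending = pending[:size], pending[size:]
        for name, processor in batch:
            results[name] = processor(task)
        adjudged = adjudicate(results, quorum)
        if adjudged is not None:
            # Later phases are skipped once a result has been adjudged.
            disagreeing = sorted(n for n, r in results.items() if r != adjudged)
            return adjudged, disagreeing
    return None, sorted(results)


replicas = [
    ("p1", lambda x: x * x),
    ("p2", lambda x: x * x + 1),  # faulty processor, for illustration
    ("p3", lambda x: x * x),
]
value, suspects = run_in_phases(4, replicas, phase_sizes=[2, 1])
print(value, suspects)
```

Here the first phase runs two copies, which disagree, so a third copy is executed in the second phase; the list of disagreeing processors corresponds to the notification that the quoted text says is passed on for diagnosis.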