The paper presents, and discusses the rationale behind, a method for structuring complex computing systems by the use of what we term “recovery blocks”, “conversations” and “fault-tolerant interfaces”. The aim is to facilitate the provision of dependable error detection and recovery facilities which can cope with errors caused by residual design inadequacies, particularly in the system software, rather than merely the occasional malfunctioning of hardware components.
Abstract: This paper gives the main definitions relating to dependability, a generic concept including as special cases such attributes as reliability, availability, safety, confidentiality, integrity, maintainability, etc. Basic definitions are given first. They are then commented upon, and supplemented by additional definitions, which address the threats to dependability (faults, errors, failures), and the attributes of dependability. The discussion on the attributes encompasses the relationship of dependability with security, survivability and trustworthiness.
This paper surveys the various problems involved in achieving very high reliability from complex computing systems, and discusses the relationship between system structuring techniques and techniques of fault tolerance. Topics covered include: 1) protective redundancy in hardware and software; 2) the use of atomic actions to structure the activity of a system to limit information flow; 3) error detection techniques; 4) strategies for locating and dealing with faults and for assessing the damage they have caused; and 5) forward and backward error recovery techniques, based on the concepts of recovery line, commitment, exception, and compensation. The ideas described relate to techniques used to date in systems intended for environments in which high reliability is demanded. Three specific systems, the JPL-STAR, the Bell Laboratories ESS No. 1A processor, and the PLURIBUS, are described in some detail and compared.