2005
DOI: 10.1145/1113841.1113843

Software-controlled fault tolerance

Abstract: Traditional fault tolerance techniques typically utilize resources ineffectively because they cannot adapt to the changing reliability and performance demands of a system. This paper proposes software-controlled fault tolerance, a concept allowing designers and users to tailor their performance and reliability for each situation. Several software-controllable fault detection techniques are then presented: SWIFT, a software-only technique, and CRAFT, a suite of hybrid hardware/software techniques. Finally, the…

Cited by 131 publications (68 citation statements)
References 35 publications
“…Hence, whenever any edge weight crosses the threshold value, we switch to version V1 as we can no longer guarantee non-criticality of instructions in version V2 due to the unexpected dependence. Note that since [20] is a software-only based fault tolerance scheme, there are certain kinds of hardware errors which cannot be detected by such a scheme; for more details, we refer the reader to [34]. While we choose to use the scheme in [20] for our experiments, it is important to note that any improvements to software-only fault tolerant schemes (such as hardware support for correct update of the PC) are orthogonal to our technique and hence can be jointly used.…”
Section: Ensuring Application-level Correctness (mentioning)
confidence: 99%
“…On Software-Controlled Fault Tolerance [8], the authors presented a set of software and hybrid (software and hardware) transient fault detection techniques. Each of the proposed techniques had a different cost/benefit relation by improving reliability or performance.…”
Section: A Executions In a Fault Injection Environment (mentioning)
confidence: 99%
“…Continuing their research in fault tolerance for transient faults, the same authors of [8] proposed Spot [9], a technique to dynamically insert redundant instructions to detect errors generated by transient faults. This dynamic insertion was made in runtime using instrumentation.…”
Section: A Executions In a Fault Injection Environment (mentioning)
confidence: 99%
“…Other works such as CRAFT and PROFIT [27] improve upon the SWIFT solution by leveraging additional hardware structures and architectural vulnerability factor (AVF) analysis [23], respectively. Compiler-based instruction duplication delivers nearly complete fault coverage, with the added benefit of requiring little to no hardware cost.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although symptom-based detection is inexpensive, the amount of coverage that can be obtained from a symptom-only approach is typically limited. To address this limitation, we make use of the second area of prior research, software-based instruction duplication [26,27]. With this approach, instructions are duplicated and results are validated within a single thread of execution.…”
Section: Introduction (mentioning)
confidence: 99%
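
The last citation statement describes instruction duplication: computations are duplicated and their results validated within a single thread of execution. A minimal Python sketch of that idea follows. It is only an illustration, not SWIFT or CRAFT themselves, which act on compiler-generated instructions rather than source-level code. The names `checked_sum`, `FaultDetected`, and the `flip_bit_at` fault-injection hook are hypothetical and do not come from the paper.

```python
class FaultDetected(Exception):
    """Raised when the two redundant copies of a value disagree."""


def checked_sum(values, flip_bit_at=None):
    """Sum integer `values` twice in one thread and compare before returning.

    `flip_bit_at` is a hypothetical fault-injection hook (an assumption for
    this sketch): if set to an index, the lowest bit of the primary
    accumulator is flipped after that step to mimic a transient fault.
    """
    primary = 0  # original computation
    shadow = 0   # duplicated computation
    for i, v in enumerate(values):
        primary += v
        shadow += v
        if flip_bit_at == i:
            primary ^= 1  # simulate a single-bit upset in one copy only
    # Validation point: compare both copies before the result leaves the function.
    if primary != shadow:
        raise FaultDetected(f"mismatch: {primary} != {shadow}")
    return primary
```

In this sketch the comparison is placed only at the point where the value escapes the function. A fault that corrupts both copies identically, or that strikes after the check, would go undetected, which is in line with the citing authors' remark that a software-only scheme cannot detect certain kinds of hardware errors.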