2017
DOI: 10.1007/s10766-017-0492-3

RedThreads: An Interface for Application-Level Fault Detection/Correction Through Adaptive Redundant Multithreading

Abstract: In the presence of accelerated fault rates, which are projected to be the norm on future exascale systems, it will become increasingly difficult for high-performance computing (HPC) applications to accomplish useful computation. Due to the fault-oblivious nature of current HPC programming paradigms and execution environments, HPC applications are insufficiently equipped to deal with errors. We believe that HPC applications should be enabled with capabilities to actively search for and correct errors in their co…

Cited by 14 publications (36 citation statements)
References 30 publications
“…Software fault detection with recovery studies are promising. Latest approaches [18,19] presented application-level fault correction models. The programming model proposed in [18] adopts n cores to execute code redundantly.…”
Section: B Redundant Muti-threading and Motivation
Citation type: mentioning
confidence: 99%
“…Note that memory store operations are also inside the vulnerable window because they are not protected by any way. RedThreads [19] adopts optimizations of adaptive SOR [29] and lazy fault detection [30], but it also suffers from the same vulnerable window. What's more, both [18] and [19] cannot recover other register values and control flow errors.…”
Section: B Redundant Muti-threading and Motivation
Citation type: mentioning
confidence: 99%
“…These software approaches introduce large size and performance overheads that have to be taken into consideration. In this sense, [13], [16] and [17] use high-level libraries such as OpenMP and PThreads to generate software redundancy, or [18], where a modification of PThreads (RedThreads) is used to generate reliability-oriented redundancy with performance overheads close to 3×. However, all of them emphasize a clear performance overhead due to the complexity introduced by the mentioned libraries.…”
Section: Introduction
Citation type: mentioning
confidence: 99%