2018
DOI: 10.1145/3210559

Declarative Resilience

Abstract: To protect multicores from soft-error perturbations, research has explored various resiliency schemes that provide high soft-error coverage. However, these schemes incur high performance and energy overheads. We observe that not all soft-error perturbations affect program correctness, and some soft-errors only affect program accuracy, i.e., the program completes with certain acceptable deviations from error free outcome. Thus, it is practical to improve processor efficiency by trading off resiliency overheads …

Cited by 8 publications (8 citation statements)
References 46 publications
“…To improve the performance efficiency of resilience schemes, researchers have also explored selective resiliency mechanisms for an application, such that performance and error coverage demands are both fulfilled [18,26,29,38,40]. These schemes are also listed in Table 1 (the third block).…”
Section: Related Work (mentioning)
confidence: 99%
“…The temporal dual-modular redundancy scheme is shown on a multicore, where each instance of the application shares and utilizes all available hardware resources of the system. Some solutions [13,26,38,40] assure improved performance by employing selective resiliency that trades off program accuracy with resilience overheads. For instance, a recently proposed work [26] exploits performance-accuracy, where it bifurcates an application into crucial/non-crucial regions and enables redundancy only for protecting crucial regions, whereas non-crucial regions are partially protected via software resiliency mechanisms.…”
Section: Related Work (mentioning)
confidence: 99%
“…These works have shown good results by compensating the costs of fault tolerance techniques with the speed-up obtained through AC. However, most of the works presented are at the circuit level [10,31-34] or require special architecture [9,29,35,36], so hardware modifications are still necessary. This means that those works are not suitable for COTS processors.…”
Section: Fault Tolerance With Approximate Computing Techniques (mentioning)
confidence: 99%