2015
DOI: 10.1145/2842620

FluidCheck

Abstract: Soft errors have become a serious cause of concern with reducing feature sizes. The ability to accommodate complex, Simultaneous Multithreading (SMT) cores on a single chip presents a unique opportunity to achieve reliable execution, safe from soft errors, with low performance penalties. In this context, we present FluidCheck, a checker architecture that allows highly flexible assignment and migration of checking duties across cores. In this article, we present a mechanism to dynamically use the resources of S…

Cited by 3 publications (11 citation statements)
References 42 publications
“…Prior works [35,45] have also explored symptom-based soft-error detection/recovery mechanisms, but they provide low soft-error coverage, since they rely on coarse-grain detectors, such as fatal-traps, hangs, panics, and so on. Under hardware-based resilience schemes [35,39,45], the solutions enable redundancy mechanisms, such as TLR [4,11,17,27] or nMR [41,44] to provide soft-error protection. For instance, prior work [44] focuses on applying DMR on a multicore (GPU) setting, where it redundantly executes two copies of the same application, and delivers high soft-error coverage by performing cross checks in a duplicated thread.…”
Section: Related Work (mentioning)
confidence: 99%
“…Moreover, HaRE requires hardware modifications to the cache coherence protocol for protection against soft-error lockups, as each core needs protected access to data and synchronization variables. Similarly to HaRE, another recent work, FluidCheck [11] proposes a resiliency scheme that relies on temporal redundant execution. However, FluidCheck is also expected to suffer from cache coherence protocol hangs/lock-ups due to a soft-error strike, essentially suffering from low system availability.…”
Section: Related Work (mentioning)
confidence: 99%