2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2010
DOI: 10.1109/iccad.2010.5653788

Application-Aware diagnosis of runtime hardware faults

Abstract: Extreme technology scaling in silicon devices drastically affects reliability, particularly because of runtime failures induced by transistor wearout. Current online testing mechanisms focus on testing all components in a microprocessor, including hardware that has not been exercised, and thus have high performance penalties. We propose a hybrid hardware/software online testing solution where components that are heavily utilized by the software application are tested more thoroughly and frequently. Thus, our on…

Cited by 13 publications (20 citation statements)
References 14 publications
“…Here, hardware components are tested at regular intervals in order to detect faults such as stuck-at, bridge or path-delay [30]-[32]. In these systems, execution is partitioned into computational epochs, and workload execution is periodically suspended to test the underlying hardware components.…”
Section: A Hardware Failures Addressed (mentioning)
confidence: 99%
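
The epoch-based scheme described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the cited papers' implementation: the `EpochRunner` class, its `epoch_length` parameter, and the per-component test callables are all hypothetical names introduced here, and real systems run hardware-level tests rather than Python functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EpochRunner:
    """Hypothetical model: run work in epochs, then pause to test components."""

    # Maps a component name to a test callable; True means the test passed.
    component_tests: Dict[str, Callable[[], bool]]
    # Number of work items per computational epoch (illustrative value).
    epoch_length: int = 4
    # Components whose test has failed so far.
    faulty: List[str] = field(default_factory=list)

    def run(self, work_items: List[Callable[[], None]]) -> int:
        """Execute all work items and return the number of test phases run."""
        test_phases = 0
        for start in range(0, len(work_items), self.epoch_length):
            # One computational epoch of normal workload execution.
            for item in work_items[start:start + self.epoch_length]:
                item()
            # The workload is suspended here while components are tested.
            for name, test in self.component_tests.items():
                if name not in self.faulty and not test():
                    self.faulty.append(name)
            test_phases += 1
        return test_phases


def demo():
    """Run five work items in epochs of two, with one failing component."""
    done: List[int] = []
    runner = EpochRunner({"alu": lambda: True, "fpu": lambda: False}, epoch_length=2)
    phases = runner.run([lambda i=i: done.append(i) for i in range(5)])
    return phases, runner.faulty, done
```

Under these assumptions, every epoch ends with a test phase, so a fault is noticed at the latest one epoch after it appears; a shorter `epoch_length` trades more test overhead for a shorter detection delay.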
“…Cardio adopts this second execution paradigm for reliable computing, since it has been shown to be both very economical, as low as 1% hardware overhead [32], and effective in achieving high fault coverage [30]. Hence, this paper targets runtime hardware failures that can be detected by such periodic checks.…”
Section: A Hardware Failures Addressed (mentioning)
confidence: 99%
“…As Viper's goal is to maximize processor availability in the face of hardware faults, we assume that other mechanisms will detect faulty hardware components [17,23,31]. In our failure model, we assume that a hardware component detected as faulty can be disabled.…”
Section: Runtime Failures (mentioning)
confidence: 99%
“…Recent research on reliable processors has focused on online tests [23], fault isolation [12], redundant functional units [31] and runtime checks [1,20]. Though these solutions improve reliability, they still rely on centralized control logic: a single point of failure that can allow one fault to disable an entire core.…”
Section: Introduction (mentioning)
confidence: 99%