2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS)
DOI: 10.1109/iolts.2019.8854397

A Vulnerability Factor for ECC-protected Memory

Abstract: Fault injection studies and vulnerability analyses have been used to estimate the reliability of data structures in memory. We survey these metrics and look at their adequacy to describe the data stored in ECC-protected memory. We also introduce FEA, a new metric improving on the memory derating factor by ignoring a class of false errors. We measure all metrics using simulations and compare them to the outcomes of injecting errors in real runs. This in-depth study reveals that FEA provides more accurate result…

Cited by 10 publications (11 citation statements)
References 18 publications (31 reference statements)
“…The runtime system can transparently manage GPUs [7], [76], FPGA accelerators [18], [85], multi-node clusters [20], [27], [28], heterogeneous memories [4], [63], scratchpad memories [5], NUMA [81], [82] and cache coherent NUMA [21], [23] systems. Adding hardware support, the runtime system can guide cache replacement [37], [65], cache coherence deactivation [22], cache prefetching [47], [75], cache communication mechanisms in producer-consumer task relationships [64], [66], reliability and resilience [51]- [53], value approximation [19], and DVFS to accelerate critical tasks [26].…”
Section: Task Dataflow Programming Models (mentioning)
confidence: 99%
“…Furthermore, uncorrected errors in the same ECC word as accessed bits may cause crashes even when the bits accessed are not erroneous themselves. Therefore, the most appropriate metric to quantify the vulnerability of an ECC word stored in memory is the vulnerability factor induced only by the memory accesses to this word [11,15]. While other proposals aim to build on or replace single-bit AVF with metrics based on memory access counts [10,36], all these existing approaches rely on offline profiling or simulating.…”
Section: A Memory Vulnerability (mentioning)
confidence: 99%
“…This paper focuses on the vulnerability in memory induced by memory accesses [11,15], to which we refer simply as vulnerability in the rest of this paper. For every ECC word in memory, cycles are categorised as either safe or vulnerable, depending on the next memory access to this word: cycles preceding a load are vulnerable, and cycles preceding a store are safe.…”
Section: A Memory Vulnerability (mentioning)
confidence: 99%
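
The safe/vulnerable rule quoted above can be illustrated with a minimal Python sketch. It is not taken from the cited papers: the function names, the trace format (a list of `(cycle, kind)` pairs for a single ECC word), the convention that the trace starts at cycle 0, and the choice to treat cycles after the last access as safe are all assumptions made for this illustration.

```python
from typing import List, Tuple

Access = Tuple[int, str]  # (cycle, "load" or "store"); hypothetical trace format


def vulnerable_cycles(accesses: List[Access]) -> int:
    """Count cycles that precede a load to one ECC word.

    Cycles between the previous access (or cycle 0) and a load are
    vulnerable; cycles preceding a store are safe. Cycles after the
    last access are not counted (an assumption of this sketch).
    """
    vulnerable = 0
    previous = 0  # assumed start of the observation window
    for cycle, kind in sorted(accesses):
        if kind == "load":
            vulnerable += cycle - previous
        elif kind != "store":
            raise ValueError(f"unknown access kind: {kind!r}")
        previous = cycle
    return vulnerable


def vulnerability_factor(accesses: List[Access], total_cycles: int) -> float:
    """Fraction of the observation window during which the word is vulnerable."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return vulnerable_cycles(accesses) / total_cycles


# Example trace: store at cycle 10, loads at 30 and 50, store at 80.
trace = [(10, "store"), (30, "load"), (50, "load"), (80, "store")]
factor = vulnerability_factor(trace, total_cycles=100)
```

In this trace, only the intervals 10 to 30 and 30 to 50 precede a load, so 40 of the 100 cycles count as vulnerable under the stated assumptions.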
“…Brumar et al. [21] use the runtime information about the dependencies to build a memoisation approach to approximate the computations of executions and reduce the execution time. Jaulmes et al. [72][73][74] evaluate the use of runtime systems in reliability and fault tolerance.…”
Section: Holistic Performance Optimisation and Runtime-aware Architec... (mentioning)
confidence: 99%