2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks 2014
DOI: 10.1109/dsn.2014.50

Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory

Abstract: Memory devices represent a key component of datacenter total cost of ownership (TCO), and techniques used to reduce errors that occur on these devices increase this cost. Existing approaches to providing reliability for memory devices pessimistically treat all data as equally vulnerable to memory errors. Our key insight is that there exists a diverse spectrum of tolerance to memory errors in new data-intensive applications, and that traditional one-size-fits-all memory reliability techniques are ineff…

Cited by 140 publications (157 citation statements)
References 49 publications
“…On the other hand, there is the growing body of evidence showing that selective fault-tolerance support is of key-importance to decrease the resource costs while providing the required level of reliability. For instance, Luo et al [24] and Fang et al [16] find that different applications and different phases in applications (in our case tasks) exhibit different vulnerabilities. Although neither of these works state it explicitly, it follows that selective fault-tolerance is a natural fit to achieve a reasonable trade-off between costs and the required level of reliability for different applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…However complete task replication may be prohibitive due to the high resource cost and in fact might be excessive due to the uneven susceptibility of the different program parts to SDCs [24]. Therefore effective and efficient techniques are needed to selectively replicate tasks.…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, there is the growing body of evidence showing that selective fault-tolerance support is of key-importance to decrease the resource costs while providing the required level of reliability. For instance, Luo et al [10] and Fang et al [7] find that different applications and different phases in applications exhibit different vulnerabilities. Although neither of these works state it explicitly, it follows that selective fault-tolerance is a natural fit to achieve a reasonable trade-off between costs and the required level of reliability for different applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…However complete replication may be prohibitive due to the high resource cost and in fact might be excessive due to the uneven susceptibility of the different application phases to SDCs [10]. Therefore effective and efficient techniques are needed to selectively replicate tasks.…”
Section: Introduction (mentioning)
confidence: 99%
“…Luo et al analyzes the memory error vulnerability for data-center applications by quantifying the applications' tolerance to soft errors in memory and proposes heterogeneous memory systems [53]. Ma et al characterizes an application based on its vulnerability to soft errors in caches at different level of the memory hierarchy using a fault injection methodology [54].…”
Section: Vulnerability Analyses For Memory Resources (mentioning)
confidence: 99%