2016 IEEE/ACM 20th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)
DOI: 10.1109/ds-rt.2016.27

FlipSphere: A Software-Based DRAM Error Detection and Correction Library for HPC

Abstract: Proposed exascale systems will present considerable resiliency challenges. In particular, DRAM soft errors, or bit flips, are expected to increase greatly due to the much higher memory density of these systems. Current hardware-based fault-tolerance methods cannot cope by themselves with the expected soft-error rate. As a result, additional software is needed to address this challenge. We introduce FlipSphere, a tunable, transparent silent data corruption detection and correction library for HPC applications…

Cited by 5 publications
References 16 publications