Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis 2013
DOI: 10.1145/2503210.2503243

Low-power, low-storage-overhead chipkill correct via multi-line error correction

Abstract: Due to their large memory capacities, many modern servers require chipkill correct, an advanced type of memory error detection and correction, to meet their reliability requirements. However, existing chipkill-correct solutions incur high power or storage overheads, or both, because they use dedicated error-correction resources per codeword to perform error correction. This requires high overhead for correction and results in high overhead for error detection. We propose a novel chipkill-correct solution, multi…

Cited by 47 publications (30 citation statements)
References 16 publications
“…In addition to these commercial chipkill-correct schemes, there are a number of academic approaches (including Virtualized ECC [37], LOT-ECC [38] and Multi-ECC [39]) that use additional storage to provide strong, low-redundancy error protection.…”
Section: Multi-tiered ECC Approaches (mentioning)
confidence: 99%
“…Such an approach (relying on a retirement scheme for permanent errors) seems consistent with the direction that industry has been moving, as is evidenced by Intel's DDDC/DDDC+1 scheme. It has also been employed by other recent academic ECC papers that have expensive correction procedures [39]. The flexible nature of Bamboo ECC codes is well suited to RBS-based retirement, as is demonstrated in Section 5.2.…”
Section: Overheads (mentioning)
confidence: 99%
“…For x4 DRAM systems, such a scheme is based on a 4-bit symbol code with 32 symbols for data and 4 symbols for ECC parity and provides single symbol correction and double symbol detection. It has to activate two ranks with 18 chips per rank per memory access resulting in high power consumption and poor timing performance [20,21]. In contrast, the proposed ECC schemes for regular and checkpoint data only activate a single x4 DRAM rank and have strong reliability due to the use of symbol-based codes that have been tailored for this application.…”
Section: ECC Design (mentioning)
confidence: 99%
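The figures in the statement above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the one-symbol-per-chip mapping is an assumption inferred from the numbers (36 symbols, 36 chips), not something the quoted text states.

```python
# Parameters quoted in the citation statement; all names here are illustrative.
BITS_PER_SYMBOL = 4      # 4-bit symbol code for x4 DRAM
DATA_SYMBOLS = 32        # symbols carrying data
ECC_SYMBOLS = 4          # symbols carrying ECC parity
CHIPS_PER_RANK = 18
RANKS_PER_ACCESS = 2     # two ranks activated per memory access


def codeword_layout():
    """Derive codeword size, activated chips, and storage overhead."""
    total_symbols = DATA_SYMBOLS + ECC_SYMBOLS
    chips_activated = CHIPS_PER_RANK * RANKS_PER_ACCESS
    return {
        "total_symbols": total_symbols,
        "codeword_bits": total_symbols * BITS_PER_SYMBOL,
        "data_bits": DATA_SYMBOLS * BITS_PER_SYMBOL,
        "chips_activated": chips_activated,
        # Assumption: symbols are spread evenly over the activated chips.
        "symbols_per_chip": total_symbols / chips_activated,
        # ECC storage relative to data storage.
        "storage_overhead": ECC_SYMBOLS / DATA_SYMBOLS,
    }


if __name__ == "__main__":
    for name, value in codeword_layout().items():
        print(f"{name}: {value}")
```

Under these assumptions the codeword is 144 bits (128 data, 16 parity), the storage overhead is 12.5%, and each of the 36 activated chips holds exactly one symbol, so a whole-chip failure corrupts a single symbol.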
“…In addition, future systems may be better off decoupling detection from correction (e.g., [20]) in order to meet reliability targets.…”
Section: Case Study: Using MB-AVF for Design (mentioning)
confidence: 99%