2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA) 2014
DOI: 10.1109/hpca.2014.6835966

Precision-aware soft error protection for GPUs

Abstract: With the advent of general-purpose GPU computing, it is becoming increasingly desirable to protect GPUs from soft errors. For high computation throughput, GPUs must store a significant amount of state and have many execution units. The high power and area costs of full protection from soft errors make selective protection techniques attractive. Such approaches provide maximum error coverage within a fixed area or power limit, but typically treat all errors equally. We observe that for many floating-point-inten…

Citations: cited by 20 publications (4 citation statements)
References: 35 publications (41 reference statements)
“…Error Mitigation Techniques: There has been significant work on selective error mitigation techniques for CPUs [85] and GPUs [69], [72], [81]. Mittal et al [72] proposed compressing similar values in GPUs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Mittal et al [72] proposed compressing similar values in GPUs. Palframan et al [81] analyzed GPGPU applications and proposed architectural modifications to reduce the magnitude of errors. Unfortunately, these methods are difficult to apply to MPAs due to differences in the ISA and microarchitecture.…”
Section: Related Work (mentioning)
confidence: 99%
“…Palframan et al [70] present a technique for improving reliability of GPUs. They note that for several floating-point intensive GPU applications, small magnitude errors have negligible effect on results while large magnitude errors may get amplified to have a significant negative impact.…”
Section: Techniques for GPUs (mentioning)
confidence: 99%
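
The observation quoted above, that small-magnitude errors are often tolerable while large-magnitude ones are not, can be illustrated with a minimal sketch. It is not taken from the cited paper: it assumes IEEE 754 single-precision values, and `flip_bit` and `relative_error` are hypothetical helper names introduced only for this illustration.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value`, stored as IEEE 754 binary32, with one bit flipped.

    Bit 0 is the least significant mantissa bit, bits 23-30 are the
    exponent, and bit 31 is the sign.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return flipped


def relative_error(value: float, bit: int) -> float:
    """Relative error caused by a single bit flip at position `bit`."""
    corrupted = flip_bit(value, bit)
    return abs(corrupted - value) / abs(value)


if __name__ == "__main__":
    # Low mantissa bits barely change the value; exponent bits can change
    # it by many orders of magnitude.
    for bit in (0, 12, 22, 23, 29):
        print(f"bit {bit:2d}: relative error = {relative_error(3.0, bit):.3e}")
```

Under these assumptions, a flip in a low mantissa bit perturbs the value only slightly, while a flip in a high exponent bit changes it by many orders of magnitude, which is the asymmetry a precision-aware protection scheme would exploit.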
“…In the past, BISR has been successfully applied in the memory blocks of processor-based systems by adding spare rows, columns, and additional controller structures to correct faults during the production phase and also during in-field execution [6,7]. Other works [8][9][10] targeted data-path units, such as the register file, and some internal components of the execution units (EUs) [11]. Similarly, some works proposed reconfiguration solutions targeting computational blocks in GPGPUs [12] or other modules in the GPGPU, such as the memories [13], and functional units [14], or combinations of both aligning the system to the specific workload requirements [2,15].…”
Section: Introduction (mentioning)
confidence: 99%