Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale 2015
DOI: 10.1145/2751504.2751512

Empirical Studies of the Soft Error Susceptibility of Sorting Algorithms to Statistical Fault Injection

Abstract: Soft errors are becoming an important issue in computing systems. Near-threshold voltage (NTV), reduced circuit sizes, high performance computing (HPC), and high altitude computing all present interesting challenges in this area. Much of the existing literature has focused on hardware techniques to mitigate and measure soft errors at the hardware level. Instead, in this paper we explore the soft error susceptibility of three common sorting algorithms at the software layer. We focus on the comparison operator a…

Cited by 9 publications (3 citation statements)
References 17 publications
“…Fang et al [10] discussed a systems approach to SDCs. Other papers describe SDC-resilience for sorting algorithms [11] and matrix factorization [27], and radiation-induced SDCs in GPUs [25]. We did not find prior work related to mercurial cores in HPC.…”
Section: Related Work
Citation type: mentioning
confidence: 76%
“…Perhaps compilers could detect blocks of code whose correct execution is especially critical (via programmer annotations or impact analysis), and then automatically replicate just these computations. More generally, can we extend the class of SDC-resilient algorithms beyond sorting and matrix factorization [11,27]? That prior work evaluated algorithms using fault injection, a technique that does not require access to a large fleet.…”
Section: Next Steps and Research Directions
Citation type: mentioning
confidence: 99%
“…From another point of view, as HPC power is targeting applications beyond the graphics domain, such as scientific applications and stock markets, it faces the challenge of addressing the need to generate accurate results that should be free of errors, as these applications cannot tolerate the existence of errors as graphical applications [7]. Hard errors are not the only concern of the HPC community, soft errors are a concern as well [8]. In [9] a study done on the data of two large-scale sites of a set of systems showed that hardware and software errors covering a considerable large proportion of root causes of failures.…”
Section: Introduction
Citation type: mentioning
confidence: 99%