2015 44th International Conference on Parallel Processing
DOI: 10.1109/icpp.2015.53

Assessing the Impact of Partial Verifications against Silent Data Corruptions

Abstract: Silent errors, or silent data corruptions, constitute a major threat on very large scale platforms. When a silent error strikes, it is not detected immediately but only after some delay, which prevents the use of pure periodic checkpointing approaches devised for fail-stop errors. Instead, checkpointing must be coupled with some verification mechanism to guarantee that corrupted data will never be written into the checkpoint file. Such a guaranteed verification mechanism typically incurs a high cost. In this p…

Citations: cited by 9 publications (14 citation statements)
References: 27 publications
“…These formulas provide first-order approximations to the length of the optimal pattern in the corresponding scenario, and are valid only if the resilience parameters satisfy C, V ≪ µ. To the best of our knowledge, the only analysis that includes partial verifications is the recent work [11], which deals with patterns that may include one or several detector(s), but all of the same type. While most applications accept several detector types, there has been no attempt to determine which and how many of these detectors should be used.…”
Section: Rr N°8741 (mentioning)
confidence: 99%
“…All of these results assume the use of guaranteed verifications only. As already mentioned, the only analysis that includes partial verifications in the pattern is the recent work of [11].…”
Section: Optimal Strategies With Guaranteed Verifications (mentioning)
confidence: 99%
“…In this section we explain how to derive the optimal pattern of interleaving checkpoints and verifications. An extended presentation of the results is available in [2,4,10].…”
Section: Patterns For Divisible Load Applications (mentioning)
confidence: 99%