2019 IEEE/ACM 5th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-5) 2019
DOI: 10.1109/drbsd-549595.2019.00009

Analyzing the Performance and Accuracy of Lossy Checkpointing on Sub-Iteration of NWChem

Citations: cited by 3 publications (2 citation statements)
References: 25 publications
“…Existing lossy compressors mainly focus on optimizing three aspects: compression ratio (i.e., storage reduction ratio), compression speed (a.k.a. throughput), and reconstructed data quality based on statistical metrics such as PSNR (peak signal-to-noise ratio) and SSIM (structural similarity index measure). However, only a few works [20], [24], [25] have studied the impact of compression error on HPC applications, and none of them has systematically studied how compression errors propagate in any HPC program. This is because, unlike the traditional resilience and fault tolerance community, which has many fault injection tools (such as PinFI [26], LLFI [23], and TensorFI [27]) to investigate how resilient software applications are to hardware errors, the HPC community is missing an efficient fault injection tool for lossy compression errors, one that can help lossy compressor developers and users understand the impact of compression error on specific HPC programs.…”
Section: Research Motivation (mentioning)
confidence: 99%
“…In other words, the propagation of compression errors in HPC programs has not been well studied and understood. Therefore, current lossy compression methods may lead to unacceptably inaccurate results for scientific discovery [18]-[20] based on the corrupted program output.…”
Section: Introduction (mentioning)
confidence: 99%