Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing 2021
DOI: 10.1145/3431379.3460653

Adaptive Configuration of In Situ Lossy Compression for Cosmology Simulations via Fine-Grained Rate-Quality Modeling

Abstract: Extreme-scale cosmological simulations have been widely used by today's researchers and scientists on leadership supercomputers. A new generation of error-bounded lossy compressors has been used in workflows to reduce storage requirements and minimize the impact of throughput limitations while saving large snapshots of high-fidelity data for post-hoc analysis. In this paper, we propose to adaptively provide compression configurations to compute partitions of cosmological simulations with newly designed postana…

Cited by 13 publications (6 citation statements)
References 30 publications
“…We explain the reasons for the differences in the impact of different fault types on the post-analysis results. As aforementioned, the halo-finder algorithm searches for the halos from all the simulated data, with the following two criteria: (1) the mass of an object(s) must be greater than a threshold (e.g., 81.66 times the average mass of the whole dataset) to become a halo cell candidate [34], [35], and (2) there must be enough halo cell candidates in a certain area to form a halo. Below, for each fault type, we explain in details how each fault type potentially affects the halo-finder procedure.…”
Section: B Results For Faults Affecting Application Data
Citation type: mentioning
confidence: 99%
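
The two halo-finder criteria quoted above can be illustrated with a minimal sketch. This is not the halo finder used in the cited work: the function names, the `min_cells` parameter, the fixed rectangular region, and the toy density field are all assumptions made for illustration. Only the 81.66 factor comes from the quoted statement.

```python
import numpy as np


def halo_cell_candidates(density, factor=81.66):
    """Criterion 1: mark cells whose value exceeds `factor` times the dataset mean."""
    density = np.asarray(density, dtype=np.float64)
    return density > factor * density.mean()


def region_forms_halo(candidates, region, min_cells):
    """Criterion 2 (simplified): the region holds at least `min_cells` candidates.

    `region` is a tuple of slices and `min_cells` is a hypothetical parameter;
    a real halo finder would group neighbouring candidates instead of using a
    fixed box.
    """
    return int(np.count_nonzero(candidates[region])) >= min_cells


# Toy 8x8x8 field: a uniform background with two very dense neighbouring cells.
density = np.ones((8, 8, 8))
density[1, 1, 1] = 1.0e4
density[1, 1, 2] = 1.0e4

candidates = halo_cell_candidates(density)
dense_corner = (slice(0, 4), slice(0, 4), slice(0, 4))
empty_corner = (slice(4, 8), slice(4, 8), slice(4, 8))

print(region_forms_halo(candidates, dense_corner, min_cells=2))
print(region_forms_halo(candidates, empty_corner, min_cells=2))
```

Because the threshold is relative to the mean of the whole dataset, a fault or compression error that changes a few large values can shift the threshold for every cell, which is one way such errors can propagate into the halo count.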
“…We also note that the baryon density field in Nyx can be easily compressed (i.e., compression ratio ranging from tens to hundreds) [34], [35], thus the importance of metadata would be greatly raised due to its increasing portion in the whole file. And since some metadata fields are related to each other, certain faults in the metadata can be detected and corrected as aforementioned; in other words, as the metadata of HDF5 file format itself has a certain degree of redundancy (correlation), we do not choose to replicate the metadata.…”
Section: ) Correction Methodology
Citation type: mentioning
confidence: 99%
“…Metric 4: Similar to prior work [14], [15], [42], [32], [20], [43], [35], we plot the rate-distortion curve to compare the distortion quality with the same bit-rate, for a fair comparison between different compression approaches, taking into account diverse compression algorithms.…”
Section: Evaluation Metrics
Citation type: mentioning
confidence: 99%
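
As a rough illustration of how points on a rate-distortion curve can be computed, the sketch below pairs bit-rate (compressed bits per value) with PSNR for several error bounds. The compressor here is a stand-in, assumed only for this example: uniform quantization followed by `zlib`, not any of the compressors evaluated in the citing papers. The PSNR definition (value range over RMSE, in dB) is the convention commonly used for scientific data, and the function names are hypothetical.

```python
import zlib

import numpy as np


def psnr(original, reconstructed):
    """PSNR in dB, using the value range of the original data as the peak."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    value_range = original.max() - original.min()
    rmse = np.sqrt(np.mean((original - reconstructed) ** 2))
    return 20.0 * np.log10(value_range / rmse)


def toy_compress(data, error_bound):
    """Stand-in compressor (assumption): uniform quantization plus zlib.

    Returns (bit_rate, reconstructed), where bit_rate is compressed bits per value.
    """
    step = 2.0 * error_bound
    codes = np.round(data / step).astype(np.int32)
    payload = zlib.compress(codes.tobytes())
    bit_rate = 8.0 * len(payload) / data.size
    return bit_rate, codes * step


def rate_distortion_points(data, error_bounds):
    """One (bit_rate, psnr) point per error bound."""
    points = []
    for error_bound in error_bounds:
        bit_rate, reconstructed = toy_compress(data, error_bound)
        points.append((bit_rate, psnr(data, reconstructed)))
    return points


# Synthetic 1D signal standing in for a simulation field.
data = np.sin(np.linspace(0.0, 20.0, 10_000))
points = rate_distortion_points(data, [1e-1, 1e-2, 1e-3])
for bit_rate, quality in points:
    print(f"bit-rate = {bit_rate:.3f} bits/value, PSNR = {quality:.1f} dB")
```

Plotting these points with bit-rate on the horizontal axis and PSNR on the vertical axis, one curve per compressor, lets two approaches be compared at the same bit-rate regardless of how each one is configured.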
“…3) Test Datasets: We conduct our evaluation and comparison based on eight typical 1D∼4D real-world HPC simulation datasets, including six from Scientific Data Reduction Benchmarks [34]: 1D HACC cosmology simulation [12], 2D LAMMPS (part of the EXAALT ECP project) molecular dynamics simulation [24], 3D CESM-ATM climate simulation [6], 3D Nyx cosmology simulation [31], 4D Hurricane ISABEL simulation [16], and 4D QMCPack quantum simulation [32]. They have been widely used in much prior work [37,26,27,47,46,38,40,39,20,4] and are good representatives of production-level simulation datasets. Additionally, we also evaluate two datasets that highlight our decoders' potential to be used as in-memory compressors as discussed in §I, including 3D RTM simulation data for petroleum exploration [17] and 1D GAMESS data for quantum chemistry simulation [10].…”
Section: Performance Evaluation
Citation type: mentioning
confidence: 99%