Novel Proposals for FAIR, Automated, Recommendable, and Robust Workflows

Abhinit, Ishan; Adams, Emily K.; Alam, Khairul; Chase, Brian; Deelman, Ewa; Gorenstein, Lev; Hudson, Stephen D.; Islam, Tanzima; Larson, Jeffrey; Lentner, Geoffrey; Mandal, Anirban; Navarro, John-Luke; Nicolae, Bogdan; Pouchard, Line; Ross, Rob; Roy, Banani; Rynge, Mats; Serebrenik, Alexander; Vahi, Karan; Wild, Stefan M.; Xin, Yufeng; Silva, Rafael Ferreira da; Filgueira, Rosa

doi:10.1109/works56498.2022.00016

2022 IEEE/ACM Workshop on Workflows in Support of Large-Scale Science (WORKS) 2022

Abstract: Lightning talks of the Workflows in Support of Large-Scale Science (WORKS) workshop are a venue where the workflow community (researchers, developers, and users) can discuss work in progress, emerging technologies and frameworks, and training and education materials. This paper summarizes the WORKS 2022 lightning talks, which cover five broad topics: data integrity of scientific workflows; a machine learning-based recommendation system; a Python toolkit for running dynamic ensembles of simulations; a cross-pla…

Cited by 1 publication

References 30 publications

Asynchronous Multi-Level Checkpointing: An Enabler of Reproducibility using Checkpoint History Analytics

Assogba, Nicolae, Van Dam et al. 2023

Proceedings of the SC '23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analys

High-performance computing applications are increasingly integrating checkpointing libraries for reproducibility analytics. However, capturing an entire checkpoint history for reproducibility study faces the challenges of high-frequency checkpointing across thousands of processes. As a result, the runtime overhead affects application performance and intermediate results when interleaving is introduced during floating-point calculations. In this paper, we extend asynchronous multi-level checkpoint/restart to study the intermediate results generated from scientific workflows. We present an initial prototype of a framework that captures, caches, and compares checkpoint histories from different runs of a scientific application executed using identical input files. We also study the impact of our proposed approach by evaluating the reproducibility of classical molecular dynamics simulations executed using the NWChem software. Experiment results show that our proposed solution improves the checkpoint write bandwidth when capturing checkpoints for reproducibility analysis by a minimum of 30× and up to 211× compared to the default checkpointing approach in NWChem.
