2018
DOI: 10.1016/j.future.2017.07.015

Cloud infrastructure provenance collection and management to reproduce scientific workflows execution

Abstract: The emergence of Cloud computing provides a new computing paradigm for scientific workflow execution. It provides dynamic, on-demand and scalable resources that enable the processing of complex workflow-based experiments. With the ever-growing size of the experimental data and increasingly complex processing workflows, the need for reproducibility has also become essential. Provenance has been thought of as a mechanism to verify a workflow and to provide workflow reproducibility. One of the obstacles in reproduc…

Cited by 10 publications
(8 citation statements)
References 38 publications
“…In LONI pipeline, a provenance model exists which includes detailed records of data use and file lifecycle (Dinov et al, 2010), which is designed to inform data consumers what types of analyses can be and have been performed with the data in question; this tool is tightly coupled with the LONI pipeline ecosystem. The ReCAP (Hasham et al, 2018) project has been developed to evaluate the resource consumption of arbitrary pipelines on the cloud and can aid in cloud-instance optimization. While this tool has potential for a large impact in designing both cost effective and scalable analyses, there is considerable overhead as it manages executions through a persistent server and workflow engine.…”
Section: Emergent Technologies In Reproducible Neuroscience (mentioning)
confidence: 99%
“…These systems deploy tools on HPC environments and record detailed execution information so that scientists can keep accurate records and debug their workflows. Tools such as LONI’s provenance manager (Dinov et al, 2010), Reprozip (Chirigati et al, 2016), and ReCAP (Hasham et al, 2018) capture system-level properties such as system resources consumed and files accessed, where tools supporting the Neuroimaging Data Model (NIDM) (Sochat and Nichols, 2016), a neuroimaging-specific provenance model based on W3C-PROV (Missier et al, 2013), capture information about the domain-specific transformations applied to the data of interest.…”
Section: Introduction (mentioning)
confidence: 99%
“…The framework aims to protect data privacy from storage servers. In Hasham et al [18], another framework is presented and cloud usage scenarios are studied for the execution of scientific workflows with cloud infrastructure in order to generate the cloud-aware provenance.…”
Section: Related Work (mentioning)
confidence: 99%
“…The paper by Hasham et al [10] illustrates benefits (such as dynamic, on-demand and scalable provisioning of resources) that cloud computing provides for the execution of scientific workflows. Accordingly, it presents a framework that aims to reproduces Scientific Workflow Execution using Cloud-Aware Provenance (ReCAP).…”
Section: New Developments In Cloud and IoT (mentioning)
confidence: 99%