2016 IEEE 12th International Conference on E-Science (E-Science) 2016
DOI: 10.1109/escience.2016.7870888

A framework for scientific workflow reproducibility in the cloud

Abstract: Workflow is a well-established means by which to capture scientific methods in an abstract graph of interrelated processing tasks. The reproducibility of scientific workflows is therefore fundamental to reproducible e-Science. However, the ability to record all the required details so as to make a workflow fully reproducible is a long-standing problem that is very difficult to solve. In this paper, we introduce an approach that integrates system description, source control, container management and automatic d…

Cited by 29 publications (17 citation statements)
References 24 publications
“…Such platforms include Nextflow [17], Bwb [18] and Pachyderm [19]. While containerization addresses many of the issues outlined above and has facilitated the execution of generic workflows in a language- and cloud-agnostic manner [20,21], IaaS services still require users to deploy and manage clusters. Resource management tools like Docker Swarm and Kubernetes are mature and widely used technologies that help manage container orchestration and even support auto-scaling of resources, but they still require an installation and configuration process that may be cumbersome, or in the case of managed solutions, expensive.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…TOSCA [15] is an OASIS standard to describe the topology of cloud-based applications towards portable, reproducible application deployments. Qasha et al [21] combine two execution-environment reproducibility techniques (i.e., the logical and physical preservation) of scientific workflows using TOSCA in a container-based approach. In addition to the plain reproducibility concerns, our middleware architecture employs reflection concepts to reconfigure deployment plans, resulting in efficient execution environments.…”
Section: Execution Environment Reproducibility (mentioning)
confidence: 99%
“…The former method is used in RO-Manager [16], a tool that uses the RO-Bundle specification [6]. A more recent approach relies on user action to create the topology, relationship, and node specifications based on a standard [17] that are eventually translated to a container [18]. In this paper, we focus on automatically creating research objects using AV.…”
Section: Related Work (mentioning)
confidence: 99%