2016
DOI: 10.1177/1094342016649766

dispel4py: A Python framework for data-intensive scientific computing

Abstract: This paper presents dispel4py, a new Python framework for describing abstract stream-based workflows for distributed data-intensive applications. These combine the familiarity of Python programming with the scalability of workflows. Data streaming is used to gain performance, rapid prototyping and applicability to live observations. dispel4py enables scientists to focus on their scientific goals, avoiding distracting details and retaining flexibility over the computing infrastructure they use. The implementati…

Cited by 23 publications (23 citation statements)
References 56 publications
“…In spite of impressive achievements to date, constructing workflows and orchestrating their executions on heterogeneous systems remain a fundamental challenge. As a result, many workflow management systems (WMSs) have been developed to fulfill specific requirements of different scientific communities [1, 8,13,17,21,22,25,34]. Although these systems have a common goal, they often do not share all capabilities across different e-Infrastructures.…”
Section: Scientific Workflows (mentioning)
confidence: 99%
“…On the other hand, the growing volumes of scientific data, the increased focus on data-driven science and the achievable storage density doubling every 14 months (Kryder's Law [33]), severely stresses the available disk I/O or more generally the bandwidth between RAM and external devices. This is driving increased adoption of stream-based [17,25] for implementing data-intensive applications, as these avoid a write out to disk followed by reading in, or double that I/O load if files have to be moved. Significantly reducing the cost of data movement between stages makes it economic to compose very simple stages, e.g.…”
Section: Scientific Workflows (mentioning)
confidence: 99%
“…• dispel4py [4] implements many of the original Dispel concepts, but presents them as Python constructs. It describes abstract workflows for data-intensive applications, which are later automatically translated to the selected enactment platforms (e.g., Apache Storm, MPI, Multiprocessing, etc.)…”
Section: Classification Of Workflow Management Systems (mentioning)
confidence: 99%
“…In recent years, numerous workflow management systems (WMSs) have been developed to manage the execution of diverse workflows on heterogeneous computing resources [3,4,5,6,7,8,9]. As user communities adopt and evolve WMSs to fit their own needs, many of the features and capabilities that were once common to most WMSs have become too distinct to share across systems.…”
Section: Introduction (mentioning)
confidence: 99%