2014 IEEE 10th International Conference on E-Science 2014
DOI: 10.1109/escience.2014.44

Community Resources for Enabling Research in Distributed Scientific Workflows

Abstract: A significant amount of recent research in scientific workflows aims to develop new techniques, algorithms and systems that can overcome the challenges of efficient and robust execution of ever larger workflows on increasingly complex distributed infrastructures. Since the infrastructures, systems and applications are complex, and their behavior is difficult to reproduce using physical experiments, much of this research is based on simulation. However, there exists a shortage of realistic datasets and tools th…

Cited by 63 publications (35 citation statements)
References 30 publications

“…The three other datasets represent actual applications and have been generated with the Pegasus Workflow Generator [8]. We consider three different datasets, named LIGO, MONTAGE, and GENOME, each containing 20 graphs of 100 nodes.…”
Section: Simulation Results (mentioning)
confidence: 99%
“…We collected workflow execution traces [38], [59] (including overhead and task runtime information) from real runs of the 5 scientific workflow applications previously described. The traces are used to feed the Workflow Generator [60] toolkit to create synthetic workflows. The toolkit uses statistical data gathered from traces of actual scientific workflow executions to generate realistic, synthetic workflows that resemble the real applications.…”
Section: Experiments Conditions (mentioning)
confidence: 99%
“…What is more, the typical price for storing all of a workflow's files in the cloud for the duration of its execution is much smaller than the price for computing. For example, the Workflow Gallery [46] provides a sample Montage application consisting of 1000 tasks that executes in 3 hours 10 minutes and generates files with a total size of 4.2GiB. According to current Google Cloud pricing [2], it costs substantially more to rent standard VMs for the duration of computation ($0.1583 at $0.05 per hour) than to store all files for the same amount of time ($0.0001128 at $0.026 per GiB per month that consists of 730 hours).…”
Section: Storage and File Transfer Model (mentioning)
confidence: 99%
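
The cost comparison in the statement above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes a single VM billed pro rata by the hour and storage billed pro rata over a 730-hour month. Under those assumptions the quoted compute figure ($0.1583) is reproduced, while the quoted storage figure ($0.0001128) corresponds to the cost of one GiB over the run; scaling it to the full 4.2 GiB gives roughly $0.00047. Either way, storage is cheaper than compute by more than two orders of magnitude.

```python
# Figures taken from the quoted citation statement; names are hypothetical.
RUNTIME_HOURS = 3 + 10 / 60            # 3 hours 10 minutes
VM_PRICE_PER_HOUR = 0.05               # USD per VM-hour
STORAGE_PRICE_PER_GIB_MONTH = 0.026    # USD per GiB per month
HOURS_PER_MONTH = 730                  # month length used in the statement
TOTAL_FILE_SIZE_GIB = 4.2              # total size of generated files


def compute_cost(hours: float, price_per_hour: float) -> float:
    """Cost of renting one VM for the given duration (assumes pro-rata billing)."""
    return hours * price_per_hour


def storage_cost(size_gib: float, hours: float,
                 price_per_gib_month: float, hours_per_month: float) -> float:
    """Cost of storing size_gib for the given duration (assumes pro-rata billing)."""
    return size_gib * price_per_gib_month * hours / hours_per_month


vm_cost = compute_cost(RUNTIME_HOURS, VM_PRICE_PER_HOUR)
per_gib_cost = storage_cost(1.0, RUNTIME_HOURS,
                            STORAGE_PRICE_PER_GIB_MONTH, HOURS_PER_MONTH)
all_files_cost = storage_cost(TOTAL_FILE_SIZE_GIB, RUNTIME_HOURS,
                              STORAGE_PRICE_PER_GIB_MONTH, HOURS_PER_MONTH)

print(f"compute:            ${vm_cost:.4f}")
print(f"storage (1 GiB):    ${per_gib_cost:.7f}")
print(f"storage (4.2 GiB):  ${all_files_cost:.7f}")
print(f"compute / storage:  {vm_cost / all_files_cost:.0f}x")
```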
“…We evaluated the algorithms using ensembles consisting of synthetic workflows from the Workflow Gallery [46]. We have selected workflows representing several different classes of applications [32].…”
Section: Evaluation Procedures (mentioning)
confidence: 99%