2016
DOI: 10.1002/2015ea000139

Server‐side workflow execution using data grid technology for reproducible analyses of data‐intensive hydrologic systems

Abstract: Many geoscience disciplines utilize complex computational models for advancing understanding and sustainable management of Earth systems. Executing such models and their associated data preprocessing and postprocessing routines can be challenging for a number of reasons including (1) accessing and preprocessing the large volume and variety of data required by the model, (2) postprocessing large data collections generated by the model, and (3) orchestrating data processing tools, each with unique software depen…

Cited by 14 publications (20 citation statements)
References 28 publications
“…In a research article, for example, Essawy et al. [] present a reproducible data‐intensive impact study of drought on two U.S. States that is run remotely on a server and that uses a workflow framework that automatically records data products and tracks provenance. The research article of Yu et al.…”
Section: Content of the Special Issue (mentioning)
confidence: 99%
“…While we have created a plausible data management and scientific workflow system to tackle the issues of discovery, assembly, and data transformations/interpolations required to provide the input data for identifying the locally applicable LF models, we note that there is a need to automate our current approaches to speed up these data delivery and processing activities. We are currently working with computer scientists to develop a server-side infection data processing system based on using data warehouse principles and methods [27,30,31] to address this issue. A similar requirement for running data-intensive models across a large heterogeneous spatial domain is looking at advances in software and hardware to speed up the computational discovery and simulation process.…”
Section: Discussion (mentioning)
confidence: 99%
“…Learning parasite transmission models that take a fuller account of heterogeneous dynamics across a spatial domain is a difficult task, but the increasing availability of geolocated demographic, intervention, and disease data [23-26] together with growing advances made in computational science approaches to knowledge discovery, particularly in the areas of (1) high performance grid-based computing and programming [8,11], (2) data discovery, integration, and assembly [11,19,27-31], and (3) data-driven approaches for inferring models from measurements [32-38], mean that simulating disease dynamics and responses to interventions effectively across heterogeneous spatially structured environments at large scales are now becoming increasingly feasible. Bayesian data-driven modeling frameworks have received considerable attention in this regard given their ability for not only facilitating the induction of a dynamical system from data, but also in the use of multiple data sources for constraining the parameters of a model to capture the local transmission features of a spatial setting [21,22,33,39-41].…”
Section: Introduction (mentioning)
confidence: 99%
“…From a pragmatic perspective, this is an inefficient use of a scientist's or engineer's time. Perhaps more importantly, it inhibits the ability to reproduce or reuse studies that have a significant computational modeling component (Essawy et al., 2016; Gil et al., 2016). One way to begin to address these challenges is through better approaches for sharing and reusing models built by others.…”
Section: Chapter 1: Introduction (mentioning)
confidence: 99%
“…From a pragmatic perspective, it is an inefficient use of scientists' time. Perhaps more importantly, it inhibits scientists' ability to reproduce studies that have a significant computational modeling component (Essawy et al., 2016; Gil et al., 2016).…”
Section: Introduction (mentioning)
confidence: 99%