Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, 2007
DOI: 10.1145/1362622.1362679

Advanced data flow support for scientific grid workflow applications

Abstract: Existing work does not provide a flexible dataset-oriented data flow mechanism to meet the complex requirements of scientific Grid workflow applications. In this paper we present a sophisticated approach to this problem by introducing a data collection concept and the corresponding collection distribution constructs, which are inspired by HPF but applied to Grid workflow applications. Based on these constructs, more fine-grained data flows can be specified at an abstract workflow language level, such as m…
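The abstract is truncated and does not show the paper's actual distribution constructs, so the following is only a minimal sketch of the general idea it names: HPF-style BLOCK and CYCLIC distributions applied to a data collection that feeds several parallel workflow activities. The function names `block_distribute` and `cyclic_distribute` are hypothetical and are not taken from the paper or from any workflow language.

```python
def block_distribute(collection, n_parts):
    """Split a collection into n_parts contiguous blocks (HPF BLOCK style).

    Each block holds ceil(len / n_parts) elements; trailing blocks may be
    shorter or empty. Hypothetical helper, not the paper's construct.
    """
    if n_parts <= 0:
        raise ValueError("n_parts must be positive")
    block = -(-len(collection) // n_parts)  # ceiling division
    return [collection[i * block:(i + 1) * block] for i in range(n_parts)]


def cyclic_distribute(collection, n_parts):
    """Deal elements round-robin to n_parts consumers (HPF CYCLIC style)."""
    if n_parts <= 0:
        raise ValueError("n_parts must be positive")
    return [collection[i::n_parts] for i in range(n_parts)]


if __name__ == "__main__":
    files = [f"file{i}.dat" for i in range(10)]  # stand-in data collection
    for rank, part in enumerate(block_distribute(files, 4)):
        print("block ", rank, part)
    for rank, part in enumerate(cyclic_distribute(files, 4)):
        print("cyclic", rank, part)
```

In this sketch each sub-list would be the input of one parallel activity instance; which distribution suits a given workflow depends on how the downstream activities consume their inputs.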

Cited by 32 publications (13 citation statements)
References 18 publications
“…Conventionally data is conceived as "a collection of facts from which conclusions may be drawn" [41] or "group(s) of information that represent the qualitative or quantitative attributes of a variable or set of variables" [40]. Such type of data is also called statistical data.…”
Section: Data (mentioning)
confidence: 99%
“…Data replication is commonly used to ensure high availability, reliability, fault tolerance, and efficient access of data.  Data Collection: The data collection refers to more than one sets of data [40]. A data management system in large scale distributed environments like Grid, should provide the functionality of data collection for efficient referencing of the large data sets.…”
Section: Data Management Tasks For High Performance Environments (mentioning)
confidence: 99%
“…For instance, in [33], McClatchey et al introduce a prototype scientific workflow management system entitled CRISTAL, and the distributed scientific workflow applications that they consider are SPGs. In [41], Qin and Fahringer discuss several scientific grid workflow applications, which are all structured as SPGs: the WIEN2k workflow performs electronic structure calculations of solids using density functional theory [7], the MeteoAG workflow is a meteorology simulation application [43], and the GRASIL workflow calculates the spectral energy distribution of galaxies [44]; this latter application has actually a fork-join graph. A last example is the fMRI workflow [52], which is a cognitive neuroscience application.…”
Section: Related Work (mentioning)
confidence: 99%
“…For instance, many systems (e.g., [27,32,36,31,23,39,29,30,18,26]) support actors that make only small changes or updates to incoming data, passing on some or all of their input to downstream actors. Thus, if invocation a above retains within its output y some unchanged substructure s from its input x, denoted as x = (s ⊕ x′), y = (s ⊕ y′), then s will be stored twice: once in the trace record in(x, a) (call this occurrence s_x) and once in out(a, y) (call this occurrence s_y).…”
Section: Introduction (mentioning)
confidence: 99%
“…Furthermore, because actors often wrap complex external applications and services, various patterns of data dependencies (e.g., see [31,32,36,7]) can arise in which not all parts of the output depend on all parts of the input. Assume, e.g., that invocation a above receives input x and produces output y as follows x = (x_1 ⊕ .…”
Section: Introduction (mentioning)
confidence: 99%