2005
DOI: 10.1002/cpe.968

Virtual data Grid middleware services for data‐intensive science

Abstract: The GriPhyN virtual data system provides a suite of components and services for data-intensive sciences that enables scientists to systematically and efficiently describe, discover, and share large-scale data and computational resources. We describe the design and implementation of such middleware services in terms of a virtual data system interface called Chiron, and present virtual data integration examples from the QuarkNet education project and from functional-MRI-based neuroscience research. The Ch…

Cited by 18 publications (16 citation statements)
References 20 publications

“…We define a Grid of environmental data sources to be a set of web services following the same contract for dynamic service registry, metadata and data request interfaces, as well as output metadata scheme and data model. This is in line with the general Grid approach towards virtualization of data, services, and interfaces (Zhao et al 2006). "Behind" the web service we can store the environmental data in a file system as binary files or images, in a relational database as rows of observations, or as another web service possibly with a different service contract (Fig.…”
Section: Background and Related Work (mentioning)
confidence: 65%
“…Prior to using PASOA, Pegasus recorded some documentation about the enactment of jobs in a database, the Provenance Tracking Catalog (PTC) [24]. The causal connection between data items and jobs were not captured, and the workflow refinement phase was not documented at all.…”
Section: Performance Evaluation (mentioning)
confidence: 99%
“…Both systems also share similar graph traversal mechanisms to build the ancestry tree for a data product, and have the same set of requirements for provenance and workpattern queries; they also face similar challenges determining the granularity at which the systems should capture provenance information, and the lifetime management of the captured information. PASS has the goal of generalizing the production of certain individual items into a workflow-like pattern, for instance, running "sort a > b" would involve the same set of operations for any file produced in such a process; where in VDS we can discover such patterns in an interactive environment such as the Chiron virtual data portal [ZW+05], as users tend to repeat the same derivation process for a set of data items. Similar comments can be made about the automated provenance recording techniques being developed by Barja and Digiampietri within the context of Microsoft's Windows Workflow Foundation [BD06].…”
Section: Related Work (mentioning)
confidence: 99%
“…The Virtual Data System that we have developed to implement this model [ZW+05] maintains a precise record of procedures, inputs (both data and parameter settings) to procedures, the environment in which procedures were invoked, and relevant data about how a procedure behaved (e.g., duration). Armed with this information, we can track, for any data object created within the system, a derivation history that extends back to raw input data, and thus obtain accurate and complete information about how analysis conclusions (and all intermediate results) were derived.…”
Section: Introduction (mentioning)
confidence: 99%