The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets

Chervenak, Ann; Foster, Ian; Kesselman, Carl; Salisbury, C.; Tuecke, Steven

doi:10.1006/jnca.2000.0110

Cited by 851 publications

(407 citation statements)

References 14 publications

Supporting

Mentioning

402

Contrasting

Unclassified

Order By: Relevance

“…Ann Chernevak et al [2] introduced the "Data Grid" in an architecture paper which defines a specialized Grid architecture for handling large data volumes. The architecture is loosely defined to accommodate various models of operation but is tightly integrated with "Grid dynamics": security, awareness of virtual organizations and access to fast-changing large sets of resources.…”

Section: Data Gridsmentioning

confidence: 99%

Managing Very-Large Distributed Datasets

Branco

Zaluska

Roure

et al. 2008

On the Move to Meaningful Internet Systems: OTM 2008

View full text Add to dashboard Cite

Abstract. In this paper, we introduce a system for handling very large datasets, which need to be stored across multiple computing sites. Data distribution introduces complex management issues, particularly as computing sites may make use of different storage systems with different internal organizations. The motivation for our work is the ATLAS Experiment for the Large Hadron Collider (LHC) at CERN, where the authors are involved in developing the data management middleware. This middleware, called DQ2, is charged with shipping petabytes of data every month to research centers and universities worldwide and has achieved aggregate throughputs in excess of 1.5 Gbytes/sec over the wide-area network. We describe DQ2's design and implementation, which builds upon previous work on distributed file systems, peer-to-peer systems and Data Grids. We discuss its fault tolerance and scalability properties and briefly describe results from its daily usage for the ATLAS Experiment.

show abstract

Section: Data Gridsmentioning

confidence: 99%

Managing Very-Large Distributed Datasets

Branco

Zaluska

Roure

et al. 2008

On the Move to Meaningful Internet Systems: OTM 2008

View full text Add to dashboard Cite

show abstract

“…An early version of a replica management framework is presented in [4] where the terms Replica Management and Replica Selection are defined within the context of a Data Grid. In the European Data Grid Project a similar replica management framework was implemented [7].…”

Section: Related Workmentioning

confidence: 99%

RRS: Replica Registration Service for Data Grids

Shoshani

Sim

Stockinger

2006

Lecture Notes in Computer Science

View full text Add to dashboard Cite

Abstract. Over the last few years various scientific experiments and Grid projects have developed different catalogs for keeping track of their data files. Some projects use specialized file catalogs, others use distributed replica catalogs to reference files at different locations. Due to this diversity of catalogs, it is very hard to manage files across Grid projects, or to replace one catalog with another. In this paper we introduce a new Grid service called the Replica Registration Service (RRS). It can be thought of as an abstraction of the concepts for registering files and their replicas. In addition to traditional single file registration operations, the RRS supports collective file registration requests and keeps persistent registration queues. This approach is of particular importance for large-scale usage where thousands of files are copied and registered. Moreover, the RRS supports a set of error directives that are triggered in case of registration failures. Our goal is to provide a single uniform interface for various file catalogs to support the registration of files across multiple Grid projects, and to make Grid clients oblivious to the specific catalog used.

show abstract

“…Almost since its beginnings, grid computing -sharing of resources of potentially heterogeneous clusters of standard computers across organizations -has sought to address the challenges intrinsic to large data sets as (Chervenak, Foster, Kesselman, Salisbury, & Tuecke, 2000). In this paper, we discuss the applications of grid technologies in the medical field and also highlight links and differences with cloud computing.…”

Section: Introductionmentioning

confidence: 99%

A review of medical grids and their direction - A Swiss/Japanese perspective

Nakai¹,

Müller²,

Bagarinao³

et al. 2015

IJRSC

View full text Add to dashboard Cite

This paper presents a general and comparative review of the advances of grid technologies in medical sciences since 2000, with a special emphasis on Europe and Japan. The EU has over the years funded several big projects in the Grid and now the cloud area, covering besides the high energy physics domain also bioinformatics and medical applications as in image analysis. Bioinformatics and pharmaceutical design have been the major targets of grid applications in Japan. Grids and clouds share many of the same goals and techniques, although clouds are more commercially oriented and privately provided services and grids are rather publicly initiated resource sharing platforms between institutions without strong economic objectives. In the future, merging these two may potentially solve inter-operation problems of grids, which have been limiting the propagation of biomedical grids.

show abstract

The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets

Cited by 851 publications

References 14 publications

Managing Very-Large Distributed Datasets

Managing Very-Large Distributed Datasets

RRS: Replica Registration Service for Data Grids

A review of medical grids and their direction - A Swiss/Japanese perspective

Contact Info

Product

Resources

About