This paper discusses the process of building an environment where large-scale, complex scientific analyses can be scheduled onto a heterogeneous collection of computational and storage resources. The example application is the Southern California Earthquake Center (SCEC) CyberShake project, an analysis designed to compute probabilistic seismic hazard curves for sites in the Los Angeles area. We explain which software tools were used to build the system and describe their functionality and interactions. We show the results of running the CyberShake analysis, which included over 250,000 jobs, using resources available through SCEC and the TeraGrid.
In this paper we discuss several challenges associated with scientific workflow design and management in distributed, heterogeneous environments. Based on our prior work with a number of scientific applications, we describe the workflow lifecycle and examine our experiences and the challenges ahead as they pertain to the user experience, planning the workflow execution, and managing the execution itself.
Ensemble simulations are a promising technique for identifying the signal of atmospheric response to extra-tropical sea surface temperature variability with high statistical significance. The basic idea is to perform multiple simulations from slightly different initial conditions and then to study the average signal of the ensemble. A significant obstacle to performing such ensemble simulations is the bookkeeping required to prepare, execute, and track the progress of hundreds of different computations. We describe an ensemble simulation experiment in which the Fast Ocean Atmosphere Model was run on the U.S. TeraGrid. In this experiment, we used the GriPhyN Virtual Data System to manage our ensemble simulations and their execution on distributed resources, achieving dramatic (order-of-magnitude) reductions in turnaround time relative to previous manual experiments.
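The ensemble idea described in the abstract above (many runs from slightly different initial conditions, then an average over the members) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a real model such as the Fast Ocean Atmosphere Model, the per-step noise is a crude proxy for internal variability, and the parameter values are arbitrary. It does not reflect how the authors or the GriPhyN Virtual Data System ran their experiment.

```python
import random
import statistics


def toy_model(initial_state, forcing, rng, steps=50, noise=0.1):
    """Hypothetical stand-in for one simulation run (not a real climate model).

    The state relaxes toward `forcing` while random noise, an assumed proxy
    for internal variability, is added at every step.
    """
    state = initial_state
    for _ in range(steps):
        state = 0.9 * state + 0.1 * forcing + rng.gauss(0.0, noise)
    return state


def run_ensemble(n_members, forcing, perturbation=0.5, seed=0):
    """Run n_members simulations from slightly different initial conditions."""
    results = []
    for member in range(n_members):
        # One seeded generator per member keeps every run reproducible.
        rng = random.Random(seed + member)
        initial_state = rng.gauss(0.0, perturbation)
        results.append(toy_model(initial_state, forcing, rng))
    return results


members = run_ensemble(n_members=100, forcing=1.0)
ensemble_mean = statistics.mean(members)     # estimate of the forced signal
ensemble_spread = statistics.stdev(members)  # member-to-member variability
print(f"mean={ensemble_mean:.3f} spread={ensemble_spread:.3f}")
```

In this toy setting any single member is dominated by noise, while the ensemble mean lies much closer to the forced response. Each member is also an independent job that has to be prepared, launched, and tracked, which is the bookkeeping burden the abstract describes.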