This paper presents a system supporting reuse of simulation results in multi-experiment computational studies involving independent simulations and explores the benefits of such reuse. Using a SCIRun-based defibrillator device simulation code (DefibSim) and the SimX system for computational studies, this paper demonstrates how aggressive reuse between and within computational studies can enable interactive rates for such studies on a moderate-sized 128-node processor cluster; a brute-force approach to the problem would require two thousand nodes or more on a massively parallel machine for similar performance. Key to realizing these performance improvements is exploiting optimization opportunities that present themselves at the level of the overall workflow of the study as opposed to focusing on individual simulations. Such global optimization approaches are likely to become increasingly important with the shift towards interactive and universal parallel computing.
Background
The growing availability of low-cost, universal parallel computing resources, including commodity processors with 4-16 cores, GPUs with support for general-purpose computing, and heterogeneous multicore chips, enables small groups of researchers or even individual researchers to have dedicated access to large amounts of compute power. This availability is expected to change usage models for parallel computing, shifting the traditional emphasis on batch calculations to interactive applications. Such change requires reconsidering how parallel software systems are structured and how resources have to be managed for new interactive workflows.
Our ongoing work with the SimX computational study system [27] provides a representative platform in which to investigate these issues. Recognizing that computer simulation has become an integral part of the scientific method, SimX supports a scientific exploration process that manifests itself as computational studies built out of multiple computational experiments corresponding to individual runs of simulation software. Examples of such studies range from exploration of design spaces in engineering to molecular simulations for drug design.
As an example of the different considerations that come into play, the performance criterion driving the design of the SimX system is the need to provide meaningful results to the researcher at a timescale that permits the researcher to interactively drive the exploration process in studies involving tens of thousands of experiments. Meeting this requirement means managing resources at the level of the overall computational study instead of individual experiments. Unfortunately, with the few exceptions listed in Section 5, prior research has not examined how one might improve the performance of entire studies under high levels of parallelism.
* Supported by a collaborative NSF grant comprising CSR-0615225, CSR-0614770, and CSR-0615194.
This paper describes SimX support for study-level resource management using the specific context of Design Space Exploration (D...