Coupled scientific simulation workflows are composed of heterogeneous component applications that simulate different aspects of the physical phenomena being modeled and that interact and exchange significant volumes of data at runtime. As data volumes and generation rates keep growing, the traditional disk I/O-based approach to data movement becomes cost prohibitive, and workflows require a more scalable and efficient approach to data movement. Moreover, the cost of moving large volumes of data over the system interconnection network becomes dominant and significantly impacts workflow execution time. Minimizing the amount of network data movement and localizing data transfers are critical for reducing this cost. To achieve this, workflow task placement should exploit data locality to the extent possible and move computation closer to the data. In this paper, we investigate applying in-memory data staging and data-centric task placement to reduce the data movement cost in large-scale coupled simulation workflows. Specifically, we present a distributed data sharing and task execution framework that (1) co-locates in-memory data staging on application compute nodes to store data that needs to be shared or exchanged and (2) uses data-centric task placement to map computations onto processor cores such that a large portion of the data exchanges can be performed using intra-node shared memory. We also present the implementation of the framework and its experimental evaluation on the Titan Cray XK7 petascale supercomputer.
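As a rough illustration of the data-centric placement idea summarized above, the following sketch greedily assigns each task to the node that already holds the largest share of its staged input, subject to free cores. It is a minimal sketch under stated assumptions, not the placement algorithm of the framework: the function names `place_tasks` and `network_bytes`, the input dictionaries, and the greedy ordering are all hypothetical and chosen only for illustration.

```python
def place_tasks(staged_bytes, free_cores):
    """Greedy data-centric placement (illustrative assumption, not the paper's method).

    staged_bytes: {task: {node: bytes of the task's input staged in that node's memory}}
    free_cores:   {node: number of free processor cores}
    Returns {task: node}.
    """
    free = dict(free_cores)  # copy so the caller's dictionary is not mutated
    placement = {}
    # Handle tasks with the largest total input first; they are the costliest to misplace.
    order = sorted(staged_bytes, key=lambda t: -sum(staged_bytes[t].values()))
    for task in order:
        # Sorted so that ties are broken deterministically by node name.
        candidates = sorted(n for n in free if free[n] > 0)
        if not candidates:
            raise ValueError("not enough free cores for all tasks")
        local = staged_bytes[task]
        # Pick the node that already holds the most input data for this task.
        best = max(candidates, key=lambda n: local.get(n, 0))
        placement[task] = best
        free[best] -= 1
    return placement


def network_bytes(placement, staged_bytes):
    """Bytes that must cross the interconnect: input not staged on the chosen node."""
    return sum(
        size
        for task, per_node in staged_bytes.items()
        for node, size in per_node.items()
        if node != placement[task]
    )


# Hypothetical example: three analysis tasks, two nodes with limited cores.
staged = {
    "analysis0": {"node0": 800, "node1": 200},
    "analysis1": {"node0": 700, "node1": 300},
    "analysis2": {"node0": 100, "node1": 900},
}
cores = {"node0": 1, "node1": 2}
placement = place_tasks(staged, cores)
print(placement, network_bytes(placement, staged))
```

In this toy setting, data read from the node a task runs on would be served through intra-node shared memory, and only the remainder counts as network traffic. A greedy choice like this one can be suboptimal when cores are scarce, which is one reason a real placement scheme may instead treat the problem as a graph partitioning or mapping problem.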
KEYWORDS
coupled simulations, data-intensive application workflows, data-centric task mapping, data staging

1 INTRODUCTION

Emerging coupled simulation workflows are composed of multiple applications that interact and exchange data at runtime, and they have the potential to achieve higher accuracy and accelerate the data-to-insight process. Multi-physics, multi-model simulation workflows simulate different aspects of the phenomena being modeled by coupling multiple physical models. For example, in the Community Earth System Model [1] application workflow, separate simulations are coupled to model the interaction of the earth's ocean, atmosphere, land surface, and sea ice. Meanwhile, online data analytics workflows support analyzing raw simulation data while it is being generated. For example, the online analytics workflow for the combustion simulation S3D [2] extracts and streams simulation data to a number of analysis operations, e.g., visualization and descriptive statistics, which execute concurrently and in parallel with the simulation.

However, running coupled simulation workflows on extreme-scale computing systems is non-trivial. One challenge of running a coupled simulation workflow at scale is providing an efficient data movement mechanism to extract data from one simulation and to transfer and redistribute that data to another simulation or analysis. Traditionally, workflow applications share and exchange data using distributed file systems [3-6]. But the increasing performance gap between computation and disk I/O introduces significan...