There is growing evidence in the scientific computing community that parallel file systems are not sufficient for all HPC storage workloads. This realization has motivated extensive research into new storage system designs. The question of which alternative we should turn to implies that a single answer could satisfy a wide range of very diverse applications. We argue that such a generic solution does not exist. Instead, custom data services should be designed and tailored to the needs of specific applications on specific hardware. Furthermore, custom data services should be designed in close collaboration with users. In this paper, we present a methodology for the rapid development of such data services. This methodology promotes the design of reusable building blocks that can be composed together efficiently through a runtime based on high-performance threading, tasking, and remote procedure calls. We illustrate the success of our methodology by showcasing three examples of data services built from the same building blocks, yet targeting entirely different applications.
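To make the composition model concrete, the sketch below shows a minimal RPC provider built on Thallium, the C++ wrapper over the Mercury RPC library and the Argobots user-level threading library, which is one concrete instance of the kind of runtime described above. The "sum" procedure is purely a placeholder for illustration, not one of the services presented in the paper.

```cpp
// Minimal sketch of a microservice provider on an RPC/tasking runtime.
// Assumes the Thallium C++ wrapper (Mercury RPC + Argobots threading);
// the "sum" RPC is illustrative only.
#include <iostream>
#include <thallium.hpp>

namespace tl = thallium;

// Each incoming RPC is handled in a lightweight user-level thread, so a
// slow handler does not block progress on other requests.
static void sum(const tl::request& req, int x, int y) {
    req.respond(x + y);
}

int main() {
    // The engine owns the network progress loop and the thread scheduler.
    tl::engine engine("tcp", THALLIUM_SERVER_MODE);
    engine.define("sum", sum);
    std::cout << "Server running at " << engine.self() << std::endl;
    engine.wait_for_finalize();
    return 0;
}
```

A client would create its own engine in THALLIUM_CLIENT_MODE, define the same RPC name without a handler, look up the server's address, and invoke `sum.on(server)(x, y)`; larger services are assembled by registering many such providers on a shared engine.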
Scientific applications at exascale generate and analyze massive amounts of data. A critical requirement of these applications is the capability to access and manage this data efficiently on exascale systems. Parallel I/O, the key technology for moving data between compute nodes and storage, faces monumental challenges from the new application, memory, and storage architectures considered in the designs of exascale systems. As the storage hierarchy expands to include node-local persistent memory, burst buffers, and disk-based storage, data movement among these layers must be efficient. Parallel I/O libraries of the future should be capable of handling file sizes of many terabytes and beyond. In this paper, we describe new capabilities we have developed in Hierarchical Data Format version 5 (HDF5), the most popular parallel I/O library for scientific applications. HDF5 is one of the most heavily used libraries at the leadership computing facilities for high-performance parallel I/O on existing HPC systems. The state-of-the-art features we describe include the Virtual Object Layer (VOL), the Data Elevator, asynchronous I/O, full-featured single-writer/multiple-reader (full SWMR) access, and parallel querying. We introduce these features, their implementations, and their performance and feature benefits to applications and other libraries.
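As a rough sketch of how an application drives the asynchronous I/O capability, the example below uses the event-set (H5ES) interface introduced alongside the asynchronous VOL connector in recent HDF5 releases (1.13 and later). The file name, dataset name, and sizes are illustrative, and the async VOL connector is assumed to be enabled in the environment (typically via HDF5_VOL_CONNECTOR and HDF5_PLUGIN_PATH).

```cpp
// Minimal sketch of asynchronous HDF5 I/O through the event-set (H5ES)
// interface; assumes HDF5 1.13+ with the async VOL connector enabled.
#include <hdf5.h>

int main() {
    int     data[1024];
    hsize_t dims[1] = {1024};
    for (int i = 0; i < 1024; i++) data[i] = i;

    // An event set collects asynchronous operations so they can be
    // waited on collectively later.
    hid_t es_id = H5EScreate();

    hid_t file  = H5Fcreate_async("checkpoint.h5", H5F_ACC_TRUNC,
                                  H5P_DEFAULT, H5P_DEFAULT, es_id);
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t dset  = H5Dcreate_async(file, "step_0", H5T_NATIVE_INT, space,
                                  H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT,
                                  es_id);

    // Returns as soon as the operation is queued; the write proceeds in
    // the background while the application keeps computing.
    H5Dwrite_async(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT,
                   data, es_id);

    /* ... overlap the next computation step here ... */

    H5Dclose_async(dset, es_id);
    H5Fclose_async(file, es_id);

    // Block until every operation in the event set has completed.
    size_t  num_in_progress = 0;
    hbool_t op_failed       = 0;
    H5ESwait(es_id, H5ES_WAIT_FOREVER, &num_in_progress, &op_failed);

    H5Sclose(space);
    H5ESclose(es_id);
    return 0;
}
```

Because the write call only queues the operation, the application pays the synchronization cost once, at H5ESwait, rather than on every I/O call.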