TAPIOCA: An I/O Library for Optimized Topology-Aware Data Aggregation on Large-Scale Supercomputers

Tessier, François; Vishwanath, Venkatram; Jeannot, Emmanuel

doi:10.1109/cluster.2017.80

Cited by 25 publications

(11 citation statements)

References 17 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…With an increasing number of memory storage layers and complexity of storage system interactions, the issue of I/O performance is getting imperative and significantly hinders the overall performance of applications . Research efforts have been made to relax the POSIX semantics and alleviate the I/O bottleneck from high‐level libraries (eg, HDF5, netCDF, ADIOS), I/O middleware (eg, MPI‐IO, TAPIOCA), to I/O forwarding layers . All of these provide an array‐based data model to organize the data and define data access semantics.…”

Section: Related Workmentioning

confidence: 99%

Interfacing HDF5 with a scalable object‐centric storage system on hierarchical storage

Soumagne

Byna

et al. 2020

Concurrency and Computation

View full text Add to dashboard Cite

Summary Object storage technologies that take advantage of multitier storage on HPC systems are emerging. However, to use these technologies at present, applications have to be modified significantly from current I/O libraries. HDF5, a widely used I/O middleware on HPC systems, provides a virtual object layer (VOL) that allows applications to connect to different storage mechanisms transparently without requiring significant code modifications. We recently designed the proactive data containers (PDC) object‐centric storage system that provides the capabilities of transparent, asynchronous, and autonomous data movement taking advantage of multiple storage tiers—a decision that has so far been left upon the user on most current systems. To enable PDC's features through HDF5 without modifying application codes, we have developed an HDF5 VOL connector that interfaces with PDC. We present in this article the connector interface and evaluate its performance on Cori, a Cray XC40 supercomputer located at the National Energy Research Scientific Computing Center (NERSC). Our evaluation demonstrates up to an 8× improvement compared with HDF5 that has the most recent optimizations.

show abstract

Section: Related Workmentioning

confidence: 99%

Interfacing HDF5 with a scalable object‐centric storage system on hierarchical storage

Soumagne

Byna

et al. 2020

Concurrency and Computation

View full text Add to dashboard Cite

show abstract

“…Other researchers took the routing mechanism of BG/Q into consideration when issued sparse data access [12]. Tessier et al [13] used a different approach which combines an optimized buffering system and a topology-aware aggregators mapping algorithm targeting any kind of architecture, so the new algorithm can be easily extended. Tsujita et al [14] introduced a topology-aware data aggregation scheme which takes care of processes rank layout across compute nodes and rearranges the data collection sequence during the shuffle phase in order to mitigate network contention.…”

Section: Related Workmentioning

confidence: 99%

Topology Aware Algorithm for Two-Phase I/O in Clusters With Tapered Hierarchical Networks

Liu

2020

IEEE Access

View full text Add to dashboard Cite

It is important for large scale scientific simulations to read and write data from parallel file system efficiently. To decrease the I/O bottleneck of scientific simulations, many middle-ware solutions had been developed, and the two-phase scheme is one well-known I/O algorithm designed for collective I/O operations. During two-phase I/O based operations, a subset of processes is selected to aggregate noncontiguous pieces of data in the shuffle phase before doing collective reads/writes in the I/O phase. In the meantime, the tapered hierarchical network has long been proposed in order to decrease procurement and power cost. Higher bandwidth and lower latency can be provided in the low levels of tapered hierarchical network. In this paper, we presented a new implementation of two-phase I/O algorithm which takes into consideration the communication pattern and the topology of tapered hierarchical network when scheduling the inter-process communications during the shuffle phase. We validated the new algorithm on our high performance computers and obtained the experimental data on the I/O kernels of some simulations. A significant improvement of the shuffle phase performance was achieved by our new algorithm when compared with the previous two-phase I/O implementations.

show abstract

“…The conventional parallel I/O stack consists of high-level libraries (HDF5 [23], NetCDF [24], ADIOS [25], etc. ), I/O middleware (MPI-IO [26], TAPIOCA [27]), and I/O forwarding layer [28]. Several research efforts have focused on relaxing the POSIX semantics and on defining new data models in these layers.…”

Section: Related Workmentioning

confidence: 99%

Toward Scalable and Asynchronous Object-Centric Data Management for HPC

Tang

Byna

Tessier

et al. 2018

2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)

Self Cite

View full text Add to dashboard Cite

Emerging high performance computing (HPC) systems are expected to be deployed with an unprecedented level of complexity due to a deep system memory and storage hierarchy. Efficient and scalable methods of data management and movement through this hierarchy is critical for scientific applications using exascale systems. Moving toward new paradigms for scalable I/O in the extreme-scale era, we introduce novel object-centric data abstractions and storage mechanisms that take advantage of the deep storage hierarchy, named Proactive Data Containers (PDC). In this paper, we formulate object-centric PDCs and their mappings in different levels of the storage hierarchy. PDC adopts a client-server architecture with a set of servers managing data movement across storage layers. To demonstrate the effectiveness of the proposed PDC system, we have measured performance of benchmarks and I/O kernels from scientific simulation and analysis applications using PDC programming interface, and compared the results with existing highly tuned I/O libraries. Using asynchronous I/O along with data and metadata optimizations, PDC demonstrates up to 23× speedup over HDF5 and PLFS in writing and reading data from a plasma physics simulation. PDC achieves comparable performance with HDF5 and PLFS in reading and writing data of a single timestep at small scale, and outperforms them at a scale of larger than ten thousand cores. In contrast to existing storage systems, PDC offers user-space data management with the flexibility to allocate the number of PDC servers depending on the workload.

show abstract

TAPIOCA: An I/O Library for Optimized Topology-Aware Data Aggregation on Large-Scale Supercomputers

Cited by 25 publications

References 17 publications

Interfacing HDF5 with a scalable object‐centric storage system on hierarchical storage

Interfacing HDF5 with a scalable object‐centric storage system on hierarchical storage

Topology Aware Algorithm for Two-Phase I/O in Clusters With Tapered Hierarchical Networks

Toward Scalable and Asynchronous Object-Centric Data Management for HPC

Contact Info

Product

Resources

About