Remote Memory Access Programming in MPI-3

Hoefler, Torsten; Dinan, James; Thakur, Rajeev; Barrett, Brian; Balaji, Pavan; Gropp, William; Underwood, Keith D.

doi:10.1145/2780584

Cited by 81 publications

(45 citation statements)

References 32 publications

“…When using passive synchronization, MPI_Win_flush is used to ensure that all outstanding RMA operations initiated by the calling process have been executed without the need to release the lock. After the flush call, the buffers provided to previous MPI_Put and MPI_Get operations can be reused or read [21]. …”

Section: MPI One-sided Operations (mentioning)

confidence: 99%

Distributed join algorithms on thousands of cores

et al. 2017

Self Cite

Traditional database operators such as joins are relevant not only in the context of database engines but also as a building block in many computational and machine learning algorithms. With the advent of big data, there is an increasing demand for efficient join algorithms that can scale with the input data size and the available hardware resources. In this paper, we explore the implementation of distributed join algorithms in systems with several thousand cores connected by a low-latency network as used in high performance computing systems or data centers. We compare radix hash join to sort-merge join algorithms and discuss their implementation at this scale. In the paper, we explain how to use MPI to implement joins, show the impact and advantages of RDMA, discuss the importance of network scheduling, and study the relative performance of sorting vs. hashing. The experimental results show that the algorithms we present scale well with the number of cores, reaching a throughput of 48.7 billion input tuples per second on 4,096 cores.

“…To leverage the capabilities of hardware enabled Remote Direct Memory Access (RDMA), we have implemented the code version that uses MPI-3 One-sided communication [20] for transferring particles between processes. One-sided communication enables direct access to the remote buffers and reduces the communication overhead by avoiding message matching and complex communication protocols.…”

Section: One-sided Communication (mentioning)

confidence: 99%

Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide

Tang, Wang, Ethier, et al. 2016

SC16: International Conference for High Performance Computing, Networking, Storage and Analysis

Self Cite

“…In RMA, unlike in MP, this condition can be easily satisfied because each process can drain the network with a local flush (enforcing consistency at any point is legal [22] …”

Section: RMA vs. MP: Coordinated Checkpointing (mentioning)

confidence: 99%

Fault tolerance for remote memory access programming models

Besta, Hoefler 2014

Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing

Self Cite

Remote Memory Access (RMA) is an emerging mechanism for programming high-performance computers and datacenters. However, little work exists on resilience schemes for RMA-based applications and systems. In this paper we analyze fault tolerance for RMA and show that it is fundamentally different from resilience mechanisms targeting the message passing (MP) model. We design a model for reasoning about fault tolerance for RMA, addressing both flat and hierarchical hardware. We use this model to construct several highly-scalable mechanisms that provide efficient low-overhead in-memory checkpointing, transparent logging of remote memory accesses, and a scheme for transparent recovery of failed processes. Our protocols take into account diminishing amounts of memory per core, one of the major features of future exascale machines. The implementation of our fault-tolerance scheme entails negligible additional overheads. Our reliability model shows that in-memory checkpointing and logging provide high resilience. This study enables highly-scalable resilience mechanisms for RMA and fills a research gap between fault tolerance and emerging RMA programming models.
