FastMPJ: a scalable and efficient Java message-passing library

Expósito, Roberto R.; Ramos, Sabela; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón

doi:10.1007/s10586-014-0345-4

Cited by 8 publications

(9 citation statements)

References 22 publications

“…Moreover, Java is attractive for parallel processing. Programmers can use FastMPJ -message-passing library similar to well-established MPI [12] or OpenMP-like directives [13]. An example of parallel, multithreaded Java library is Parallel Colt proposed by P. Wendykier and J.G.…”

Section: B Modern Programming Platform

mentioning

confidence: 99%

Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi

Malinowski

2015

IJITCS

Abstract: Parallel algorithms are popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with multi-core computational accelerator: Intel Xeon Phi, and modern programming language: Java. The article includes the description of integration Java with Xeon Phi, as well as detailed information about all of the software components. Finally, the set of tests proves, that proposed platform is able to prepare reliable experiments of parallel algorithms implemented in modern programming language.

“…Java is the main language in industry and academia, and it is increasingly being adopted by the High Performance Computing (HPC) community because of its appealing features such as multi‐thread and networking support in the core of the language, and its improvements in performance, which makes it competitive regarding natively compiled languages like C/C++. Thus, different high performance Java projects have emerged in the past few years, including several implementations of message passing libraries like FastMPJ, which provides high performance communications for different devices, from InfiniBand to shared memory transfers. Since its first release, FastMPJ has paid special attention to collective operations because of their wide use in parallel codes and has provided the user with a set of optimized multi‐core aware algorithms for each operation that can be selected at runtime regarding the number of processes and the message size.…”

Section: Introduction

mentioning

confidence: 99%
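
The citation statement above says collective algorithms can be selected at runtime according to the number of processes and the message size. A minimal sketch of that idea for a broadcast follows. The class name, the three algorithm names, and both thresholds are assumptions made for illustration; the source does not name the algorithms FastMPJ ships, and this is not FastMPJ's API.

```java
/** Hypothetical selector; names and thresholds are illustrative, not FastMPJ's API. */
final class BroadcastSelector {
    /** Common broadcast strategies, used here only as example choices. */
    enum Algorithm { FLAT_TREE, BINOMIAL_TREE, SCATTER_ALLGATHER }

    // Assumed cut-over points; real values would come from benchmarking.
    static final int SMALL_GROUP = 4;
    static final int LARGE_MESSAGE_BYTES = 64 * 1024;

    /** Picks an algorithm from the process count and the message size in bytes. */
    static Algorithm select(int numProcesses, int messageBytes) {
        if (numProcesses < 1 || messageBytes < 0) {
            throw new IllegalArgumentException("invalid process count or message size");
        }
        // Few processes: the root sending to everyone directly is cheap enough.
        if (numProcesses <= SMALL_GROUP) return Algorithm.FLAT_TREE;
        // Large payloads: split the data so no single link carries all of it.
        if (messageBytes >= LARGE_MESSAGE_BYTES) return Algorithm.SCATTER_ALLGATHER;
        // Otherwise a logarithmic-depth tree keeps latency low.
        return Algorithm.BINOMIAL_TREE;
    }
}
```

Calling `BroadcastSelector.select(64, 128)` in this sketch returns `BINOMIAL_TREE`, while the same process count with a 1 MiB message returns `SCATTER_ALLGATHER`.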

Nonblocking collectives for scalable Java communications

Ramos

Taboada

Expósito

et al. 2014

Concurrency and Computation

Self Cite

Summary: This paper presents a Java implementation of the recently published MPI 3.0 nonblocking message passing collectives in order to analyze and assess the feasibility of taking advantage of these operations in shared memory systems using Java. Nonblocking collectives aim to exploit the overlapping between computation and communication for collective operations to increase scalability of message passing codes, as it has been carried out for nonblocking point‐to‐point primitives. This scalability has become crucial not only for clusters but also for shared memory systems because of the current trend of increasing the number of cores per chip, which is leading to the generalization of multi‐core and many‐core processors. Message passing libraries based on remote direct memory access, thread‐based progression, or implementing pure multi‐threading shared memory support could potentially benefit from the lack of imposed synchronization by nonblocking collectives. But, although the distributed memory scenario has been well studied, the shared memory one has not been tackled yet. Hence, nonblocking collectives support has been included in FastMPJ, a Message Passing in Java (MPJ) implementation, and evaluated on a representative shared memory system, obtaining significant improvements because of overlapping and lack of implicit synchronization, and with barely any overhead imposed over common blocking operations. Copyright © 2014 John Wiley & Sons, Ltd.
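
The overlap of computation and communication that this summary describes follows a start, compute, wait pattern. The sketch below imitates that pattern with standard Java concurrency only: a `CompletableFuture` stands in for the request handle of a nonblocking reduction. The class and method names are hypothetical, and no message-passing library is involved, so this illustrates the control flow rather than FastMPJ's implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.stream.IntStream;

/** Hypothetical illustration of the start / compute / wait pattern. */
final class OverlapSketch {
    /** Stand-in for starting a nonblocking sum reduction; returns a handle at once. */
    static CompletableFuture<Integer> startSum(int[] contributions) {
        return CompletableFuture.supplyAsync(() -> IntStream.of(contributions).sum());
    }

    /** Returns {reduced sum, result of the independent local work}. */
    static int[] run(int[] contributions, int localWork) {
        // 1. Start the collective-like operation without waiting for it.
        CompletableFuture<Integer> request = startSum(contributions);

        // 2. Do computation that does not depend on the reduction result.
        int local = 0;
        for (int i = 1; i <= localWork; i++) {
            local += i;
        }

        // 3. Block only when the result is needed, like waiting on a request.
        int reduced = request.join();
        return new int[] { reduced, local };
    }
}
```

The point of the pattern is step 2: a blocking collective would force the caller to idle there, whereas the nonblocking form lets independent work proceed until `join()`.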

“…Finally, FastMPJ [9] is our Java message-passing implementation that includes a layered design approach similar to MPJ Express, but avoiding its data buffering overhead by supporting direct communication of any serializable Java object. Moreover, FastMPJ includes a scalable collective library which implements up to six algorithms per collective primitive.…”

mentioning

confidence: 99%

“…Previous MPJ middleware (e.g., mpiJava and MPJ Express) can also provide this specific support (i.e., not using TCP/IP emulations), but only when relying on an underlying native message-passing library. Therefore, FastMPJ communication devices must conform with the API provided by the abstract class xxdev.Device [9]. For instance, Liu et al [49,50] explored the feasibility of providing high-performance RDMA communications over InfiniBand in the context of the MPICH project [51].…”

mentioning

confidence: 99%

“…These xxdev devices abstract the particular operation of a communication protocol conforming to an API on top of which FastMPJ implements its communications. Therefore, FastMPJ communication devices must conform with the API provided by the abstract class xxdev.Device [9]. The low-level xxdev API only provides basic point-to-point communication methods and is not aware of higher level MPI abstractions like communicators.…”

mentioning

confidence: 99%
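
The statement above describes a low-level device layer: an abstract class that exposes only basic point-to-point operations and knows nothing about communicators. The sketch below shows the shape of such a layer under stated assumptions. `SketchDevice`, `LoopbackDevice`, and every method signature here are hypothetical; the real `xxdev.Device` signatures are not given in the source and are not reproduced.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical device API: point-to-point only, no communicators. */
abstract class SketchDevice {
    abstract int rank();
    abstract void send(Object message, int destination, int tag);
    abstract Object recv(int source, int tag);
}

/** In-process device: ranks share one JVM and exchange messages through queues. */
final class LoopbackDevice extends SketchDevice {
    private final int rank;
    // One mailbox per (source, destination, tag) triple, shared by all ranks.
    private final Map<String, BlockingQueue<Object>> mailboxes;

    LoopbackDevice(int rank, Map<String, BlockingQueue<Object>> mailboxes) {
        this.rank = rank;
        this.mailboxes = mailboxes;
    }

    private BlockingQueue<Object> box(int src, int dst, int tag) {
        return mailboxes.computeIfAbsent(src + ">" + dst + "#" + tag,
                key -> new LinkedBlockingQueue<>());
    }

    @Override int rank() { return rank; }

    @Override void send(Object message, int destination, int tag) {
        // The queue is unbounded, so offer never rejects; null messages are not allowed.
        box(rank, destination, tag).offer(message);
    }

    @Override Object recv(int source, int tag) {
        try {
            return box(source, rank, tag).take(); // blocks until a message arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while receiving", e);
        }
    }
}
```

A higher layer would build communicators and collectives on top of `send` and `recv`, and a different transport could be swapped in by providing another subclass. As a usage example, two devices created over the same `ConcurrentHashMap` can exchange an object with `d0.send("hello", 1, 7)` followed by `d1.recv(0, 7)`.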

Low‐latency Java communication devices on RDMA‐enabled networks

Expósito

Taboada

Ramos

et al. 2015

Concurrency and Computation

Self Cite

Providing high-performance inter-node communication is a key capability for running high performance computing applications efficiently on parallel architectures. In fact, current systems deployments are aggregating a significant number of cores interconnected via advanced networking hardware with Remote Direct Memory Access (RDMA) mechanisms, that enable zero-copy and kernel-bypass features. The use of Java for parallel programming is becoming more promising thanks to some useful characteristics of this language, particularly its built-in multithreading support, portability, easy-to-learn properties, and high productivity, along with the continuous increase in the performance of the Java virtual machine. However, current parallel Java applications generally suffer from inefficient communication middleware, mainly based on protocols with high communication overhead that do not take full advantage of RDMA-enabled networks. This paper presents efficient low-level Java communication devices that overcome these constraints by fully exploiting the underlying RDMA hardware, providing low-latency and high-bandwidth communications for parallel Java applications. The performance evaluation conducted on representative RDMA networks and parallel systems has shown significant point-to-point performance increases compared with previous Java communication middleware, allowing to obtain up to 40% improvement in application-level performance on 4096 cores of a Cray XE6 supercomputer.
