PCJ - Java library for high performance computing in PGAS model

Nowicki, Marek; Górski, Łukasz; Grabarczyk, Patryk; Bała, Piotr

doi:10.1109/hpcsim.2014.6903687

Cited by 11 publications

(2 citation statements)

References 9 publications

“…Several compliant languages and libraries have been discussed in Reference 16 including: original PGAS languages—CAF, Titanium, UPC; HPCS PGAS languages—Chapel, X10, Fortress; Retrospective PGAS languages—HPF, ZPL and GA as well as XCalableMP (XMP)—PGAS extension for C and Fortran. Notable recent examples include using PCJ for HPC systems [17], big data processing [18], clouds [19] as well as Shoal for clusters of processors and FPGAs [20]. HPX [21] is a C++ library developed for concurrency and parallelism that supports parallel, concurrent and distributed functions for general purpose programming, in particular active global address space (AGAS) that allows moving objects between nodes without changing addresses.…”

Section: Related Work (mentioning)

confidence: 99%

A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems

Czarnul

2023

Concurrency and Computation

Summary: In the article, we have proposed a framework that allows programming a parallel application for a multi‐node system, with one or more graphical processing units (GPUs) per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs and extended CUDA calls allow to transfer data and launch kernels on local and remote GPUs. The framework hides inter‐node MPI communication from the programmer. For optimization, the implementation takes advantage of the MPI_THREAD_MULTIPLE mode allowing: multiple threads handling distinct GPUs as well as overlapping communication and computations transparently using multiple CUDA streams. The solution allows data parallelization across available GPUs in order to minimize execution time and supports a power‐aware mode in which GPUs are automatically selected for computations using a greedy approach in order not to exceed an imposed power limit. We have implemented and benchmarked three parallel applications including: finding the largest divisors; verification of the Collatz conjecture; finding patterns in vectors. These were tested on three various systems: a GPU cluster with 16 nodes, each with NVIDIA GTX 1060 GPU; a powerful 2‐node system—one node with 8 NVIDIA Quadro RTX 6000 GPUs, the second with 4 NVIDIA Quadro RTX 5000 GPUs; a heterogeneous environment with one node with 2 NVIDIA RTX 2080 and 2 nodes with NVIDIA GTX 1060 GPUs. We demonstrated effectiveness of the framework through execution times versus power caps within ranges of 100–1400 W, 250–3000 W, and 125–600 W for these systems respectively as well as gains from using two versus one CUDA streams per GPU. Finally, we have shown that for the testbed applications the solution allows to obtain high speed‐ups between 89.3% and 97.4% of the theoretically assessed ideal ones, for 16 nodes and 2 CUDA streams, demonstrating very good parallel efficiency.

“…Because the use of PGAS languages is familiar in one-sided communication, applications in PGAS languages can sometimes exhibit higher performance than those using MPI library by directly using a communication layer close to hardware [5,6]. Examples of PGAS languages include XcalableMP (XMP) [5,7,8]; XcalableACC [9][10][11]; Coarray Fortran [12], PCJ [13], Unified Parallel C, 1 UPC++ [14], HabaneroUPC++ [15], X10 [16], Chapel [17], and DASH [18].…”

Section: Introduction (mentioning)

confidence: 99%

Mixed-Language Programming with XcalableMP

Nakao

2020

XcalableMP PGAS Programming Language

This chapter presents the mixed-language programming with XcalableMP and other programming languages. It is supported by the linkage functions between XcalableMP and MPI library. We also demonstrate how to call XcalableMP program from Python program (M. Nakao et al., Linkage of XcalableMP and Python languages for high productivity on HPC cluster system, Proceedings of Workshops of HPC Asia, No. 9, pp. 39–47, 2018).
