2016
DOI: 10.1177/1094342015626584

An MPI/OpenACC implementation of a high-order electromagnetics solver with GPUDirect communication

Abstract: We present performance results and an analysis of a message passing interface (MPI)/OpenACC implementation of an electromagnetic solver based on a spectral-element discontinuous Galerkin discretization of the time-dependent Maxwell equations. The OpenACC implementation covers all solution routines, including a highly tuned element-by-element operator evaluation and a GPUDirect gather–scatter kernel to effect nearest neighbor flux exchanges. Modifications are designed to make effective use of vectorization, str…
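To make the abstract's "element-by-element operator evaluation" concrete, the following is a minimal, hypothetical C/OpenACC sketch of one tensor-product derivative applied independently within every element. The routine name, array layout, and loop schedule are assumptions for illustration and are not taken from the NekCEM source.

    /* Hypothetical sketch (not NekCEM source): an element-by-element
     * tensor-product derivative of the kind the abstract's "operator
     * evaluation" refers to, offloaded with OpenACC.                    */
    #include <stdio.h>
    #include <stdlib.h>

    /* ur(i,j,k,e) = sum_m D(i,m) * u(m,j,k,e): apply the 1-D derivative
     * matrix D (nx x nx) along the first direction of every element.    */
    void local_deriv_r(int nelt, int nx, const double *restrict D,
                       const double *restrict u, double *restrict ur)
    {
        int npts = nx * nx * nx;
        /* One gang per (element, k-plane); vector parallelism over (j, i). */
        #pragma acc parallel loop gang collapse(2) \
                copyin(D[0:nx*nx], u[0:nelt*npts]) copyout(ur[0:nelt*npts])
        for (int e = 0; e < nelt; ++e)
            for (int k = 0; k < nx; ++k) {
                #pragma acc loop vector collapse(2)
                for (int j = 0; j < nx; ++j)
                    for (int i = 0; i < nx; ++i) {
                        double s = 0.0;
                        for (int m = 0; m < nx; ++m)   /* small dense mat-vec */
                            s += D[i*nx + m] * u[((e*nx + k)*nx + j)*nx + m];
                        ur[((e*nx + k)*nx + j)*nx + i] = s;
                    }
            }
    }

    int main(void)
    {
        int nelt = 512, nx = 8;                /* e.g. 8^3 points per element */
        int npts = nx * nx * nx;
        double *D  = malloc(sizeof(double) * nx * nx);
        double *u  = malloc(sizeof(double) * nelt * npts);
        double *ur = malloc(sizeof(double) * nelt * npts);
        for (int i = 0; i < nx * nx; ++i)     D[i] = (i / nx == i % nx);  /* identity */
        for (int i = 0; i < nelt * npts; ++i) u[i] = (double)i;
        local_deriv_r(nelt, nx, D, u, ur);
        printf("ur[0] = %g (expect 0 for identity D)\n", ur[0]);
        free(D); free(u); free(ur);
        return 0;
    }

The key design point the abstract alludes to is that each element's work is a set of small dense matrix products, which map naturally onto gang/vector parallelism without any inter-element dependencies.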

Cited by 36 publications (17 citation statements). References 7 publications.
“…The lower-bound runtime is indicated by the granularity-limit line in the plots. In the present case, the GPU outperforms the CPU, but Otten et al. [11] show that the CPU-based simulations can outperform the GPU from a pure speed standpoint for the case of N = 7, where the granularity limit of the SEDG formulation is reduced to 343 points per core. Although the use of more cores allows the CPU-based simulations to be faster, it does not alter their overall energy consumption, which remains at 2.5× that of the GPU-based runs.…”
Section: Modeling Multi-GPU Performance
confidence: 78%
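For orientation, the 343 points-per-core figure quoted above corresponds, under the assumption that N = 7 denotes seven grid points per direction of a hexahedral element and that the granularity limit is one element per core, to

\[
7 \times 7 \times 7 \;=\; 7^{3} \;=\; 343 \ \text{points per core}.
\]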
“…A major difference is that one can amortize the nearest-neighbor communication costs by updating the surface flux terms for all six components of the vector-field pair (E, H) in a single pass. Figure 4 shows performance results for the OpenACC/GPU-based variant of NekCEM developed in Otten et al. [11]. Timing runs are presented for the Cray XK7, Titan, using one GPU per node. Also shown in panels (b) and (c) are multi-CPU runs using 1, 4, 8, and 16 cores per node on Titan and on the IBM BG/Q, Vesta, for P = 1, 2, 4, ….”
Section: Modeling Multi-GPU Performance
confidence: 99%
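The amortization this excerpt describes (updating all six components of (E, H) in one gather-scatter pass) can be illustrated with the following hedged MPI sketch. The buffer layout, neighbor handling, and function name are invented for illustration and do not reproduce the NekCEM/gslib interface.

    /* Hypothetical sketch (not the NekCEM/gslib interface): one combined
     * nearest-neighbor exchange that carries all six field components
     * (Ex, Ey, Ez, Hx, Hy, Hz) per shared point, so each neighbor is
     * messaged once per step instead of six times.                      */
    #include <mpi.h>
    #include <stdlib.h>

    #define NFLD 6   /* Ex, Ey, Ez, Hx, Hy, Hz */

    /* fld[f][i]: field f at shared interface point i (npts points, assumed
     * common to every neighbor purely to keep the sketch short).          */
    void exchange_all_fields(double *fld[NFLD], int npts,
                             const int *neigh, int nneigh, MPI_Comm comm)
    {
        size_t chunk = (size_t)npts * NFLD;
        double *sendbuf = malloc(nneigh * chunk * sizeof *sendbuf);
        double *recvbuf = malloc(nneigh * chunk * sizeof *recvbuf);
        MPI_Request *req = malloc(2 * nneigh * sizeof *req);

        for (int n = 0; n < nneigh; ++n) {
            double *s = sendbuf + n * chunk;
            for (int f = 0; f < NFLD; ++f)          /* pack six fields at once */
                for (int i = 0; i < npts; ++i)
                    s[f * npts + i] = fld[f][i];
            /* With GPUDirect / CUDA-aware MPI these buffers could be device
             * resident (e.g. exposed via OpenACC host_data use_device).    */
            MPI_Irecv(recvbuf + n * chunk, (int)chunk, MPI_DOUBLE,
                      neigh[n], 0, comm, &req[2 * n]);
            MPI_Isend(s, (int)chunk, MPI_DOUBLE,
                      neigh[n], 0, comm, &req[2 * n + 1]);
        }
        MPI_Waitall(2 * nneigh, req, MPI_STATUSES_IGNORE);

        for (int n = 0; n < nneigh; ++n) {          /* gather-scatter: sum in  */
            const double *r = recvbuf + n * chunk;  /* remote contributions    */
            for (int f = 0; f < NFLD; ++f)
                for (int i = 0; i < npts; ++i)
                    fld[f][i] += r[f * npts + i];
        }
        free(sendbuf); free(recvbuf); free(req);
    }

A caller would invoke this once per time step after the volume kernels; one Waitall per step then replaces six separate rounds of packing, sending, and waiting, which is where the latency amortization comes from.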
“…Codes that utilise MPI+OpenACC include: the electromagnetics code NekCEM (Otten et al., 2016), the Community Atmosphere Model - Spectral Element (CAM-SE) (Norman et al., 2015), and the combustion code S3D (Levesque et al., 2012). Codes that utilise MPI+OpenMP include the computational fluid dynamics code MFIX (Gel et al., 2009), second-order Møller-Plesset perturbation theory (MP2) (Katouda and Nakajima, 2013), and molecular dynamics (Kunaseth et al., 2013).…”
Section: Related Work
confidence: 99%
“…Many applications take advantage of heterogeneous hardware using an approach known as MPI+X, which leverages MPI for communication and an accelerator language (e.g., CUDA and OpenCL) or directive-based language (e.g., OpenMP and OpenACC) for computation. Codes that utilize MPI+OpenACC include: the electromagnetics code NekCEM (25), the Community Atmosphere Model - Spectral Element (CAM-SE) (22), and the combustion code S3D (20). Codes that utilize MPI+OpenMP include the computational fluid dynamics code MFIX (10), second-order Møller-Plesset perturbation theory (MP2) (17), and molecular dynamics (19).…”
Section: Related Work
confidence: 99%
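As a generic illustration of the MPI+X pattern both excerpts describe (MPI for communication, OpenACC directives for the offloaded computation), here is a minimal, self-contained sketch. The 1-D stencil, array names, and ring exchange are placeholders and are not drawn from any of the cited codes.

    /* Hypothetical MPI+OpenACC sketch: OpenACC offloads the local update,
     * MPI exchanges halo data between ranks.                             */
    #include <mpi.h>
    #include <stdio.h>

    #define N 1024   /* interior points per rank, plus 2 halo cells */

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        int left  = (rank - 1 + size) % size;
        int right = (rank + 1) % size;

        double u[N + 2], unew[N + 2];
        for (int i = 0; i < N + 2; ++i) u[i] = (double)rank;

        #pragma acc data copy(u[0:N+2]) create(unew[0:N+2])
        for (int step = 0; step < 100; ++step) {
            /* Halo exchange with MPI; with a CUDA-aware MPI and GPUDirect,
             * host_data exposes device addresses so no host staging occurs. */
            #pragma acc host_data use_device(u)
            {
                MPI_Sendrecv(&u[N], 1, MPI_DOUBLE, right, 0,
                             &u[0], 1, MPI_DOUBLE, left,  0,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Sendrecv(&u[1],     1, MPI_DOUBLE, left,  1,
                             &u[N + 1], 1, MPI_DOUBLE, right, 1,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            }
            /* Local computation offloaded with OpenACC. */
            #pragma acc parallel loop present(u, unew)
            for (int i = 1; i <= N; ++i)
                unew[i] = 0.5 * (u[i - 1] + u[i + 1]);
            #pragma acc parallel loop present(u, unew)
            for (int i = 1; i <= N; ++i)
                u[i] = unew[i];
        }

        if (rank == 0) printf("done\n");
        MPI_Finalize();
        return 0;
    }

The division of labor is the point: MPI handles only the thin halo messages, while the bulk of the data stays resident on the accelerator for the directive-annotated loops.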