A GPU accelerated discontinuous Galerkin incompressible flow solver

Karakus, Ali; Chalmers, Noel; Świrydowicz, Kasia; Warburton, Tim

doi:10.1016/j.jcp.2019.04.010

Cited by 30 publications

(23 citation statements)

References 42 publications


“…Comprehensive work on the topic has been undertaken by Karakus et al. [53], who introduced a GPU-accelerated DG solver that uses a semi-Lagrangian subcycling approach and computes the pressure with a preconditioned conjugate gradient method.…”

Section: Incompressible Flows

mentioning

confidence: 99%
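
The preconditioned conjugate gradient method named in the statement above can be sketched as follows. This is only a minimal illustration on a small dense symmetric positive definite system: the Jacobi (diagonal) preconditioner, the matrix `A`, the right-hand side `b`, and the function name `pcg` are all assumptions made for the example, and none of them is taken from the cited solver.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [dot(row, v) for row in A]


def pcg(A, b, tol=1e-10, max_iter=100):
    """Jacobi-preconditioned conjugate gradient for an SPD matrix A.

    Returns the approximate solution and the number of iterations used.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # hypothetical choice: M = diag(A)
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Illustrative symmetric, strictly diagonally dominant (hence SPD) system.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 3.0, -1.0, 0.0],
     [0.0, -1.0, 5.0, -2.0],
     [0.0, 0.0, -2.0, 6.0]]
b = [1.0, 2.0, 3.0, 4.0]
x, iterations = pcg(A, b)
```

In a pressure solve of the kind the statement describes, the dense `matvec` would be replaced by the application of the discrete operator, and the diagonal scaling by whatever preconditioner the solver actually uses.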

High-Order Incompressible Computational Fluid Dynamics on Modern Hardware Architectures

Loppi

2021

Preprint


Modern GPU architectures are characterised by an abundance of compute capability relative to memory bandwidth. This makes them very well-suited to solving temporally explicit and spatially compact discretisations of hyperbolic conservation laws. However, classical pressure-projection-based incompressible Navier–Stokes formulations do not fall into this category. One attractive formulation for solving incompressible problems on modern hardware is the method of artificial compressibility. When combined with explicit dual time stepping and a high-order Flux Reconstruction discretisation, the majority of operations can be cast as arithmetically intensive matrix–matrix multiplications that are well-suited for GPUs. In this seminar, I will present the high-order cross-platform incompressible Navier–Stokes solver in PyFR, together with three explicit convergence acceleration techniques: a polynomial multigrid, a novel locally adaptive pseudo-time stepping approach and novel stability-optimised Runge-Kutta schemes. The solver and the convergence acceleration techniques are validated for a range of turbulent test cases, including a simulation of the DARPA SUBOFF submarine model using hundreds of NVIDIA GPUs.


“…In the standard case of simplicial or mapped box-type elements, the use of nodal basis functions is common practice as it offers significant computational savings [20,28,26,15,25,35,32,16]. This is because nodal basis functions allow for offline precomputation of the evaluation of the basis functions on the quadrature points, which are transferred into the physical domain via elemental maps.…”

Section: Computing the Integrals

mentioning

confidence: 99%
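
The offline precomputation described in the statement above can be illustrated with a small one-dimensional sketch. Everything here is an assumption chosen for the example rather than something taken from the cited work: a quadratic Lagrange basis on the reference interval [-1, 1], a two-point Gauss-Legendre rule, an affine elemental map, and the hypothetical names `lagrange_basis`, `B`, and `integrate_on_element`.

```python
import math


def lagrange_basis(nodes, j, xi):
    """Value at xi of the j-th Lagrange polynomial defined on `nodes`."""
    value = 1.0
    for m, node in enumerate(nodes):
        if m != j:
            value *= (xi - node) / (nodes[j] - node)
    return value


# Reference element data (assumed): three nodes and a 2-point Gauss rule.
REF_NODES = [-1.0, 0.0, 1.0]
QUAD_POINTS = [-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)]
QUAD_WEIGHTS = [1.0, 1.0]

# Offline step: tabulate every basis function at every quadrature point once.
# B[q][j] is basis function j evaluated at quadrature point q.
B = [[lagrange_basis(REF_NODES, j, xi) for j in range(len(REF_NODES))]
     for xi in QUAD_POINTS]


def integrate_on_element(a, b, nodal_values):
    """Integrate the nodal interpolant over the physical element [a, b].

    The affine map x = a + (b - a) * (xi + 1) / 2 contributes only the
    constant Jacobian (b - a) / 2, so the table B is reused unchanged.
    """
    jacobian = 0.5 * (b - a)
    total = 0.0
    for q, weight in enumerate(QUAD_WEIGHTS):
        u_q = sum(B[q][j] * nodal_values[j] for j in range(len(nodal_values)))
        total += weight * u_q
    return jacobian * total


# Usage: f(x) = x**2 sampled at the mapped nodes x = 0, 1, 2 of element [0, 2].
integral = integrate_on_element(0.0, 2.0, [0.0, 1.0, 4.0])
```

Because the table `B` depends only on the reference element, the per-element work reduces to a small matrix-vector product scaled by the Jacobian of the map.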

“…The benefits of GPU-acceleration in the context of discontinuous Galerkin methods have been studied extensively in the literature over the last decade or so for various classes of electromagnetic, fluid flow and other hyperbolic PDE problems; we refer to [28,26,15,25,35,32,16] for some of the most successful results in the area. The predominant application setting involves explicit time-stepping, e.g., by structure-preserving Runge-Kutta methods, combined with discontinuous Galerkin spatial discretizations with nodal representation of local finite element spaces [20].…”

mentioning

confidence: 99%

“…With regard to shape generality, each element is allowed to be a general polytope with arbitrary number of (d − 1)-dimensional polytopic faces; we refer to [13] for a detailed discussion on the definition and structure of dG methods on polytopic meshes. The element-shape generality requires both new data structures as well as the resolution of new algorithmic challenges, compared to dG implementations on standard simplicial or box-type meshes [28,26,15,25,35,32,16]. The algorithms presented below aim to use parallelization within GPU clusters to address the key challenge of reducing the computational cost of arbitrary order quadrature rules over general polytopic domains.…”

mentioning

confidence: 99%


GPU-Accelerated Discontinuous Galerkin Methods on Polytopic Meshes

Dong, Georgoulis, Kappas

2021

SIAM J. Sci. Comput.


Discontinuous Galerkin (dG) methods on meshes consisting of polygonal/polyhedral (henceforth, collectively termed as polytopic) elements have received considerable attention in recent years. Due to the physical frame basis functions used typically and the quadrature challenges involved, the matrix-assembly step for these methods is often computationally cumbersome. To address this important practical issue, this work proposes two parallel assembly implementation algorithms on CUDA-enabled graphics cards for the interior penalty dG method on polytopic meshes for various classes of linear PDE problems. We are concerned with both single GPU parallelization, as well as with implementation on distributed GPU nodes. The results included showcase almost linear scalability of the quadrature step with respect to the number of GPU-cores used, since no communication is needed for the assembly step. In turn, this can justify the claim that polytopic dG methods can be implemented extremely efficiently, as any assembly computing time overhead compared to finite elements on 'standard' simplicial or box-type meshes can be effectively circumvented by the proposed algorithms.


“…By using the OCCA [19,20] library's unified API, NekRS can run on CPUs and on GPU-accelerated CPUs that support CUDA, HIP, or OpenCL. For performance portability, the code is based on the open concurrent compute abstraction and leverages scalable developments in the SEM code Nek5000 and in libParanumal [21,22], which is a library of high-performance kernels for high-order discretization and PDE-based mini-apps. Critical performance results on several platforms indicate the strong scaling of NekRS, including scaling to 27,648 V100s on OLCF Summit, for calculations of up to 60B grid points [23].…”

Section: Introduction on NekRS

mentioning

confidence: 99%

Assessment of Fast Reactor Hot Channel Factor Calculation Capability in Griffin and NekRS

Shemon, Yu, Park, et al.

2021


The Laboratory's main facility is outside Chicago, at 9700 South Cass Avenue, Argonne, Illinois 60439. For information about Argonne and its pioneering science and technology programs, see www.anl.gov.

DOCUMENT AVAILABILITY

Online Access: U.S. Department of Energy (DOE) reports produced after 1991 and a growing number of pre-1991 documents are available free at OSTI.GOV (http://www.osti.gov/), a service of the US Dept. of Energy's Office of Scientific and Technical Information.
