GPU Accelerated Discontinuous Galerkin Methods for Shallow Water Equations

Gandham, R.; Medina, David; Warburton, Tim

doi:10.4208/cicp.070114.271114a

Cited by 40 publications

(35 citation statements)

References 24 publications


“…Restriction of the global residual $r_N$ starts by scaling it by the inverted mass matrix to form $y_N = r_N / m_N$ and to use (6) to form the local vector $y_{ijk;e}$. Then, $y_{ijk;e}$ is used to form the local coarse residual (see (9)):…”

Section: Coarse Grid Preconditioner

mentioning

confidence: 99%
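
As a rough illustration of the first two steps in the statement above (scaling the global residual by the inverse mass matrix, then forming element-local vectors), the sketch below assumes a diagonal mass matrix stored as a vector and a hypothetical `local_to_global` index map standing in for equation (6), which is not reproduced on this page. The function names are illustrative, the local index is flattened rather than written as ijk, and the coarse residual of equation (9) is left out because its definition is not available here.

```python
def scale_by_inverse_mass(r_global, m_global):
    """Form y = r / m entrywise, assuming a diagonal mass matrix (an assumption)."""
    return [r / m for r, m in zip(r_global, m_global)]


def gather_to_elements(y_global, local_to_global):
    """Copy global values into one local vector per element.

    local_to_global[e][i] is the global index of local node i in element e.
    This map is a hypothetical stand-in for equation (6) of the cited paper.
    """
    return [[y_global[g] for g in element_nodes]
            for element_nodes in local_to_global]


# Two 1D linear elements sharing global node 1 (illustrative data only).
local_to_global = [[0, 1], [1, 2]]
r_global = [1.0, 4.0, 3.0]
m_global = [0.5, 1.0, 0.5]

y_global = scale_by_inverse_mass(r_global, m_global)
y_local = gather_to_elements(y_global, local_to_global)
```

The shared node appears in both local vectors, which is the duplication a global-to-local gather is expected to produce.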

GPU accelerated spectral finite elements on all-hex meshes

Remacle

Gandham

Warburton

2016

Journal of Computational Physics


This paper presents a spectral element finite element scheme that efficiently solves elliptic problems on unstructured hexahedral meshes. The discrete equations are solved using a matrix-free preconditioned conjugate gradient algorithm. An additive Schwarz two-scale preconditioner is employed that allows h-independent convergence. An extensible multi-threading programming API is used as a common kernel language that allows runtime selection of different computing devices (GPU and CPU) and different threading interfaces (CUDA, OpenCL and OpenMP). Performance tests demonstrate that problems with over 50 million degrees of freedom can be solved in a few seconds on an off-the-shelf GPU.
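
The abstract describes a matrix-free preconditioned conjugate gradient solver. A minimal sketch of that iteration follows, under stated assumptions: the operator is a 1D finite-difference Laplacian rather than a spectral element operator, and a simple Jacobi (diagonal) preconditioner stands in for the additive Schwarz two-scale preconditioner of the paper. All function names are hypothetical.

```python
def apply_laplacian_1d(u):
    """Matrix-free action of the (-1, 2, -1) stencil with zero Dirichlet ends."""
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out[i] = 2.0 * u[i] - left - right
    return out


def jacobi_inverse(r):
    """Jacobi preconditioner: divide by the operator diagonal (2 everywhere)."""
    return [ri / 2.0 for ri in r]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(apply_A, apply_Minv, b, tol=1e-10, max_iter=500):
    """Preconditioned CG; A and M^{-1} are supplied only as functions."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the zero initial guess
    z = apply_Minv(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_Minv(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

Because the solver only ever calls `apply_A` and `apply_Minv`, no matrix is assembled. That is the sense in which the sketch is matrix-free, and swapping in a different preconditioner means passing a different function.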



“…In this section, we compare the computational cost of hexahedra, wedges, and pyramids relative to the computational cost of tetrahedra. The results reported are for tuned computational kernels, where the number of elements processed per workgroup has been chosen in order to minimize the runtimes of the volume, surface, and update kernels for each element [34,17]. As suggested in [26], automation of this process is crucial for portable performance across various architectures, especially for hybrid meshes where parameters must be tuned for 12 separate kernels.…”

Section: Cost Per Element Type

mentioning

confidence: 99%
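
The statement above mentions choosing the number of elements processed per workgroup to minimize kernel runtimes, and notes that automating this matters for portability. A minimal sketch of such an exhaustive parameter search is given below. The names `time_kernel`, `autotune`, and `run_kernel` are hypothetical, and the sketch times an arbitrary Python callable rather than a real GPU kernel.

```python
import time


def time_kernel(run_kernel, elements_per_group, repeats=3):
    """Return the best wall-clock time over a few runs for one parameter value."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run_kernel(elements_per_group)
        best = min(best, time.perf_counter() - start)
    return best


def autotune(run_kernel, candidates, measure=time_kernel):
    """Try every candidate value and keep the one with the smallest measured cost.

    `measure` is injectable so a deterministic cost model can replace timing.
    """
    timings = {c: measure(run_kernel, c) for c in candidates}
    best = min(timings, key=timings.get)
    return best, timings
```

For a hybrid mesh, a search of this kind would be repeated once per kernel and element type, which is why the quoted passage stresses automation.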

“…Since the details of the implementation are independent of element type, we refer the reader to [21,17,34] for a description of the multi-rate scheme on triangular and tetrahedral meshes.…”

mentioning

confidence: 99%

GPU-accelerated discontinuous Galerkin methods on hybrid meshes

Chan

Wang

Modave

et al. 2016

Journal of Computational Physics


We present a time-explicit discontinuous Galerkin (DG) solver for the time-domain acoustic wave equation on hybrid meshes containing vertex-mapped hexahedral, wedge, pyramidal and tetrahedral elements. Discretely energy-stable formulations are presented for both Gauss-Legendre and Gauss-Legendre-Lobatto (Spectral Element) nodal bases for the hexahedron. Stable timestep restrictions for hybrid meshes are derived by bounding the spectral radius of the DG operator using order-dependent constants in trace and Markov inequalities. Computational efficiency is achieved under a combination of element-specific kernels (including new quadrature-free operators for the pyramid), multi-rate timestepping, and acceleration using Graphics Processing Units.


“…The local element-to-element coupling and the dense algebraic operations required per element make nodal DG methods suitable for parallel multi-threading computations, especially with GPUs. This implementation has successfully been adapted for several applications [Fuhry et al, 2014, Gandham et al, 2015, Godel et al, 2010, Modave et al, 2015].…”

Section: Introduction

mentioning

confidence: 99%

A nodal discontinuous Galerkin method for reverse-time migration on GPU clusters

Modave

St-Cyr

Mulder

et al. 2015

Geophys. J. Int.


Improving both accuracy and computational performance of numerical tools is a major challenge for seismic imaging and generally requires specialized implementations to make full use of modern parallel architectures. We present a computational strategy for reverse-time migration (RTM) with accelerator-aided clusters. A new imaging condition computed from the pressure and velocity fields is introduced. The model solver is based on a high-order discontinuous Galerkin time-domain (DGTD) method for the pressure-velocity system with unstructured meshes and multi-rate local time-stepping. We adopted the MPI+X approach for distributed programming where X is a threaded programming model. In this work we chose OCCA, a unified framework that makes use of major multi-threading languages (e.g. CUDA and OpenCL) and offers the flexibility to run on several hardware architectures. DGTD schemes are suitable for efficient computations with accelerators thanks to localized element-to-element coupling and the dense algebraic operations required for each element. Moreover, compared to high-order finite-difference schemes, the thin halo inherent to the DGTD method reduces the amount of data to be exchanged between MPI processes and the storage requirements for RTM procedures. The amount of data to be recorded during simulation is reduced by storing only boundary values in memory rather than on disk and recreating the forward wavefields. Computational results are presented that indicate that these methods are strongly scalable up to at least 32 GPUs for a three-dimensional RTM case.

