Nodal discontinuous Galerkin methods on graphics processors

Klöckner, Andreas; Warburton, Tim; Bridge, J.; Hesthaven, Jan S.

doi:10.1016/j.jcp.2009.06.041

Cited by 283 publications

(256 citation statements)

References 16 publications

Mentioning: 253

“…This is related to the fact that the algorithm is memory bandwidth limited rather than compute limited. It is well known that the use of linear finite elements with explicit time steps results in relatively few calculations being performed for a given amount of data being loaded from the memory; this is why it has been proposed to use higher order elements on GPUs [35] to increase the number of calculations per node at each time step. In the case of bandwidth limited algorithms such as this, since the amount of data which must be loaded from memory doubles for double precision, the run time will be expected to approximately double.…”

Section: Discussion (mentioning)

confidence: 99%
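The bandwidth argument in the statement above can be illustrated with a rough roofline-style estimate. This is a minimal sketch, not taken from the cited paper: the function name `estimated_runtime`, the hardware figures, and the per-value operation counts are all hypothetical, and the model ignores caching and any difference between single- and double-precision compute throughput.

```python
def estimated_runtime(n_values, flops_per_value, bytes_per_value,
                      peak_flops, peak_bandwidth):
    """Roofline-style estimate: the slower of compute time and memory time."""
    compute_time = n_values * flops_per_value / peak_flops
    memory_time = n_values * bytes_per_value / peak_bandwidth
    return max(compute_time, memory_time)


# Hypothetical hardware figures, chosen only for illustration.
PEAK_FLOPS = 1.0e12      # floating-point operations per second
PEAK_BANDWIDTH = 1.0e11  # bytes per second

# Few operations per loaded value, as with low-order elements (assumed figure).
N_VALUES = 10**7
FLOPS_PER_VALUE = 2

single = estimated_runtime(N_VALUES, FLOPS_PER_VALUE, 4,
                           PEAK_FLOPS, PEAK_BANDWIDTH)
double = estimated_runtime(N_VALUES, FLOPS_PER_VALUE, 8,
                           PEAK_FLOPS, PEAK_BANDWIDTH)
ratio = double / single

print(f"single precision: {single:.2e} s")
print(f"double precision: {double:.2e} s")
print(f"ratio: {ratio:.1f}")
```

With these assumed figures the memory term dominates in both cases, so doubling the bytes per value doubles the estimate. Raising `FLOPS_PER_VALUE` far enough, as higher-order elements would, makes the compute term dominate instead.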

“…Two partitioning schemes are considered in this paper. The first is the simple 'greedy partitioner'; the algorithm follows that of [35]. The second is a more efficient partitioning scheme, the 'aligned partitioner', which has been developed to subdivide the mesh into neat aligned blocks.…”

Section: Partitioning (mentioning)

confidence: 99%

“…The resulting 125-node elements fitted well within a block of 128 threads, which could then be efficiently arranged in memory; while this is an elegant solution it is unlikely to be practical for the general case, particularly when considering lower order problems. Klöckner et al. [35] recognised the problem for the more general case, where many elements must be allocated to each block; a partitioning scheme was therefore required to divide the mesh into blocks in an efficient manner. It was recognised that general purpose partitioners such as those developed for parallel computing [36] were rarely suitable, since the specific nature of the hardware configuration requires particular limits on size which are challenging to implement.…”

Section: Introduction (mentioning)

confidence: 99%
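The statements above describe allocating many elements to each block under a hard size limit. One plausible greedy scheme is sketched below. It is an illustrative assumption, not the algorithm of [35] or of the citing paper: the function `greedy_partition`, the adjacency-dictionary mesh representation, and the breadth-first growth rule are all hypothetical choices.

```python
from collections import deque


def greedy_partition(adjacency, max_block_size):
    """Group elements into blocks of at most max_block_size elements.

    Each block is grown breadth-first from the lowest-numbered unassigned
    element, so neighbouring elements tend to land in the same block.
    """
    unassigned = set(adjacency)
    blocks = []
    for seed in sorted(adjacency):
        if seed not in unassigned:
            continue
        block = []
        queue = deque([seed])
        unassigned.discard(seed)
        while queue and len(block) < max_block_size:
            element = queue.popleft()
            block.append(element)
            for neighbour in adjacency[element]:
                if neighbour in unassigned:
                    unassigned.discard(neighbour)
                    queue.append(neighbour)
        # Elements still queued were reserved but not placed; release them
        # so that a later block can pick them up.
        unassigned.update(queue)
        blocks.append(block)
    return blocks


# Hypothetical one-dimensional mesh: five elements in a chain.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
blocks = greedy_partition(chain, max_block_size=2)
print(blocks)
```

A scheme like this respects the size limit but can leave small leftover blocks, which is one reason a hardware-aware alternative may be preferred.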


Accelerated finite element elastodynamic simulations using the GPU

Huthwaite

2014

Journal of Computational Physics



“…A recent review of applications in FEM (Finite Element Method)-based structural mechanics can be found in [1], but GPUs are also being used in a variety of other contexts. In [2], the authors apply a high-order Discontinuous Galerkin (DG) method to the solution of Maxwell's equations. In DG methods most operators are defined locally at element level.…”

Section: Introduction (mentioning)

confidence: 99%

“…Hybrid implementations of multiscale finite element approaches, whereby constitutive material computations at Gauss point level are carried out on the GPU, are discussed in [13] and [14]. In most of the above-mentioned applications, results are limited to single-precision arithmetic (e.g. [2,9,14,10]). It is important to recall, as suggested in [8], that, concerning NVIDIA GPUs, in all the implementations before CUDA compute capability 1.3, only single-precision floating point operations are supported directly by the hardware.…”

Section: Introduction (mentioning)

confidence: 99%

An explicit dynamics GPU structural solver for thin shell finite elements

Bartezzaghi

Cremonesi

Parolini

et al. 2015

Computers & Structures


With the availability of user-oriented software tools and dedicated architectures, such as the parallel computing platform and programming model CUDA (Compute Unified Device Architecture) released by NVIDIA, one of the main producers of graphics cards, and of improved, high-performance GPU (Graphics Processing Unit) boards, GPGPU (general-purpose programming on GPUs) is attracting increasing interest in the engineering community for the development of analysis tools suitable for validation/verification and virtual reality applications. Because of their inherently explicit and decoupled structure, explicit dynamics finite element formulations appear particularly attractive for implementation on hybrid CPU/GPU or pure GPU architectures. This work addresses an optimized, double-precision GPU implementation of an explicit dynamics finite element solver for elastic shell problems in small strains and large displacements and rotations, using unstructured meshes. The conceptual difference between a GPU implementation directly adapted from a standard CPU approach and a new optimized formulation, specifically conceived for GPUs, is discussed and comparatively assessed. It is shown that a speedup factor of about 5 can be achieved by an optimized algorithm reformulation and careful memory management. A speedup of more than 40 is achieved with respect to state-of-the-art commercial codes running on CPU, obtaining real-time simulations in some cases, on commodity hardware. When a latest-generation GPU board is used, it is shown that a problem with more than 16 million degrees of freedom can be solved in just a few hours of computing time, opening the way to virtualization approaches for real large-scale engineering problems.


Discontinuous Galerkin Time‐Domain Method in Electromagnetics: From Nanostructure Simulations to Multiphysics Implementations

Dong

Chen

et al. 2022

Advances in Time‐Domain Computational Electromagnetic Methods


In the last few decades, the discontinuous Galerkin time-domain (DGTD) method has become widely popular in various fields of engineering due to the fact that it benefits from computational advantages that come with finite volume and finite element formulations. Similarly, in the field of computational electromagnetics, the superiority of the DGTD method was quickly recognized after the first few works on its formulation and implementation to solve Maxwell equations. With further developments in more recent years, the DGTD method has become one of the preeminent solutions to tackle a wide variety of challenging large-scale electromagnetic problems, including those that require multiphysics modeling.

This chapter starts with a brief introduction to the DGTD method. This introduction provides the fundamentals of numerical flux, discretization techniques that rely on vector and nodal basis functions, and incorporation of absorbing boundary conditions. This is followed by descriptions of a time-domain boundary integral (TDBI) scheme, which replaces absorbing boundary conditions within the DGTD method, and a multi-step time integration technique, which uses different time step sizes for the DGTD and TDBI parts. Numerical results show that both techniques significantly improve the efficiency, accuracy, and stability of the traditional DGTD method. Then, the chapter continues with the applications of the DGTD method to several real-life practical problems. More specifically, it describes various novel techniques developed to enable the application of the DGTD method to electromagnetic analysis of nanostructures and graphene-based devices, and multiphysics simulation of optoelectronic antennas and source generators. For each application, several numerical examples are provided to demonstrate the accuracy, efficiency, and robustness of the developed techniques.

