Scalability of high-performance PDE solvers

Fischer, Paul; Min, Misun; Rathnayake, Thilina; Dutta, Som; Kolev, Tzanio; Dobrev, Veselin; Camier, Jean-Sylvain; Kronbichler, Martin; Warburton, Tim; Świrydowicz, Kasia; Brown, Jed

doi:10.1177/1094342020915762

Cited by 68 publications

(110 citation statements)

References 15 publications

Supporting

Mentioning

103

Contrasting


“…The benchmark involves a continuous finite element discretization of the Laplacian (3), using matrix-free operator evaluation within a conjugate gradient solver preconditioned by the matrix diagonal. In this study, we consider the case BP5 [29], see also https://ceed.exascaleproject.org/bps/, which integrates the weak form (4) of polynomial degree $k$ using a Gauss-Lobatto quadrature formula with $n_q^{\text{1D}} = k + 1$ quadrature points on a cube with deformed elements. While this integration is not exact, it is the typical spectral element setup with an identity interpolation matrix $S_i = I$ in Eq.…”

Section: Performance-optimized Conjugate Gradient Methods (mentioning)

confidence: 99%
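The solver structure named in the statement above (conjugate gradients, a matrix-free operator, and a diagonal preconditioner) can be sketched as follows. This is a minimal illustration, not the BP5 benchmark itself: the operator is a 1D finite-difference Laplacian stencil standing in for the spectral element operator, and the names `apply_laplacian_1d` and `jacobi_pcg` are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def apply_laplacian_1d(u: Vector) -> Vector:
    """Matrix-free action of the stencil [-1, 2, -1] with zero Dirichlet ends.

    Assumption: a 1D stand-in for the operator; no matrix is ever stored.
    """
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out[i] = 2.0 * u[i] - left - right
    return out


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def jacobi_pcg(
    apply_a: Callable[[Vector], Vector],
    diag: Vector,
    b: Vector,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Tuple[Vector, int]:
    """Conjugate gradients preconditioned by the operator diagonal."""
    x = [0.0] * len(b)
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0  # zero right-hand side: the zero vector is the solution
    r = list(b)  # residual for the initial guess x = 0
    z = [ri / di for ri, di in zip(r, diag)]  # apply the diagonal preconditioner
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        ap = apply_a(p)  # the only place the operator is touched
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

For the stand-in stencil the diagonal is the constant 2, so a call could look like `jacobi_pcg(apply_laplacian_1d, [2.0] * 16, [1.0] * 16)`. In a real matrix-free code the diagonal would have to be computed element by element, since no assembled matrix is available to read it from.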

“…Since the metric terms do not depend on the shape function indices $i_K$ and $j$, and the sum over $j$ does not depend on $i_K$, the summations in the equation can be broken up into (1) an $d n_q \times n_{\text{dof,ele}}$ matrix operation to evaluate the reference element derivative of $u^{(K)}$ at the quadrature points, (2) the application of metric terms as well as other physics terms at $n_q$ quadrature points, and (3) an $n_{\text{dof,ele}} \times d n_q$ matrix operation to test by all $n_{\text{dof,ele}}$ test functions and perform the summation over the quadrature points. The separation of point-wise physics evaluation at quadrature points is a common abstraction in integration-based matrix-free methods [29,41,49,50].…”

Section: Node-level Performance Through Matrix-free Implementation (mentioning)

confidence: 99%
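The three-step split described in the statement above can be sketched for a single element as follows. This is an illustration under stated assumptions: the derivative matrix is stored densely as `D[q][c][i]` (quadrature point, direction, shape function), so the sum-factorization speedup is deliberately not shown, and the function name `apply_element_operator` and the argument layout are hypothetical.

```python
from typing import List

Vector = List[float]


def apply_element_operator(
    D: List[List[List[float]]],       # D[q][c][i]: d/dxi_c of shape function i at point q
    metric: List[List[List[float]]],  # metric[q]: d x d matrix of metric terms and weight
    u: Vector,                        # element-local degrees of freedom
) -> Vector:
    n_q = len(D)
    d = len(D[0])
    n_dof = len(u)

    # (1) (d*n_q) x n_dof operation: reference gradient of u at the quadrature points
    grad = [
        [sum(D[q][c][i] * u[i] for i in range(n_dof)) for c in range(d)]
        for q in range(n_q)
    ]

    # (2) point-wise application of metric (and physics) terms at each quadrature point
    flux = [
        [sum(metric[q][c][e] * grad[q][e] for e in range(d)) for c in range(d)]
        for q in range(n_q)
    ]

    # (3) n_dof x (d*n_q) operation: test with all shape functions, sum over points
    return [
        sum(D[q][c][i] * flux[q][c] for q in range(n_q) for c in range(d))
        for i in range(n_dof)
    ]
```

As a check in 1D: a linear element of length h with two Gauss points has reference derivatives -1/2 and 1/2 and a metric factor of 2/h per point (unit weights), which reproduces the familiar element stiffness matrix (1/h)[[1, -1], [-1, 1]]. Only step (2) changes when the physics changes, which is the abstraction the quoted passage points to.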

“…
• support for both continuous [49] and discontinuous finite elements on uniform and adaptively refined meshes with hanging nodes and deformed elements,
• support for arbitrary polynomial expansions on quadrilateral and hexahedral element shapes as well as tensor product quadrature rules,
• minimization of arithmetic operations by using available symmetries, such as the even-odd decomposition [69] and a switch between the collocation derivative (5) for $n_q^{\text{1D}} \approx k + 1$ quadrature points or an alternative variant based on derivatives of the original polynomials as used in [49] and discussed in [29],
• flexible implementation of operations at quadrature points,
• vectorization across several elements to optimally use SIMD units (AVX, AVX-512, AltiVec) of modern processors,
• applicability to modern multi-core CPUs as well as GPUs [51,57],
• data access optimizations such as element-based loops for DG elements [50,56],
• and MPI implementation with tight data exchange as well as MPI-only and shared-memory models [43,48,54].
…”

Section: Implementation Of Sum Factorization In the deal.II Library (mentioning)

confidence: 99%


ExaDG: High-Order Discontinuous Galerkin for the Exa-Scale

Arndt, Fehn, Kanschat, et al. (2020). Lecture Notes in Computational Science and Engineering. Self-citation.

This text presents contributions to efficient high-order finite element solvers in the context of the project ExaDG, part of the DFG priority program 1648 Software for Exascale Computing (SPPEXA). The main algorithmic components are the matrix-free evaluation of finite element and discontinuous Galerkin operators with sum factorization to reach a high node-level performance and parallel scalability, a massively parallel multigrid framework, and efficient multigrid smoothers. The algorithms have been applied in a computational fluid dynamics context. The software contributions of the project have led to a speedup by a factor of 3–4, depending on the hardware. Our implementations are available via the deal.II finite element library.



“…To achieve this efficiency, high-order methods use mesh elements that are mapped from canonical reference elements (hexes, wedges, pyramids and tetrahedra) and exploit, where possible, the tensor-product structure of the canonical mesh elements and finite-element spaces. Through matrix-free partial assembly, the use of canonical reference elements enables substantial cache efficiency and minimizes extraneous data movement in comparison to traditional low-order approaches [5].…”

Section: (E) Efficient Finite-element Discretization Of PDEs On Unstr… (mentioning)

confidence: 99%

Exascale applications: skin in the game

Alexander, Almgren, Bell, et al. (2020). Phil. Trans. R. Soc. A. Self-citation.

“…The finite element method (FEM) with linear or quadratic elements (that reduce the potential for locking), or mixed or enhanced formulations (that avoid locking) are often adopted, whereas higher order elements are rarely employed. This can potentially reduce the advantage of matrix‐free methods using sum‐factorization techniques, which are generally more competitive for higher order elements. However, we note that, even with linear FEM, large‐scale computations with as many as $10^{12}$ unknowns are possible with a matrix‐free approach but infeasible with sparse matrices as there is simply not enough memory to store the sparse tangent matrix even on large supercomputers.…”

Section: Introduction (mentioning)

confidence: 99%
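The memory argument in the statement above can be made concrete with a back-of-the-envelope estimate. The figures below are assumptions for illustration, not values from the cited paper: compressed sparse row (CSR) storage, 8-byte values, 8-byte indices (needed because 10^12 exceeds the 32-bit range), and 81 nonzeros per row, which corresponds to an interior node of a trilinear hexahedral mesh with three displacement components (27 neighboring nodes times 3).

```python
def csr_matrix_bytes(
    n_rows: int, nnz_per_row: int, value_bytes: int = 8, index_bytes: int = 8
) -> int:
    """Bytes for a CSR matrix: values, column indices, and row pointers."""
    nnz = n_rows * nnz_per_row
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes


def vector_bytes(n: int, value_bytes: int = 8) -> int:
    """Bytes for one dense vector of n unknowns."""
    return n * value_bytes


n_unknowns = 10**12
matrix = csr_matrix_bytes(n_unknowns, nnz_per_row=81)  # assumed sparsity pattern
vector = vector_bytes(n_unknowns)

print(f"one solution vector: {vector / 1e12:.1f} TB")
print(f"CSR tangent matrix:  {matrix / 1e15:.2f} PB")
print(f"matrix / vector:     {matrix / vector:.0f}x")
```

Under these assumptions the matrix needs more than a hundred times the memory of a single solution vector, which is the gap a matrix-free operator avoids by storing only vectors and per-quadrature-point data.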

A matrix‐free approach for finite‐strain hyperelastic problems using geometric multigrid

Davydov, Pelteret, Arndt, et al. (2020). Numerical Meth Engineering. Self-citation.

The performance of finite element solvers on modern computer architectures is typically memory bound for sufficiently large problems. The main cause for this is that loading matrix elements from RAM into CPU cache is significantly slower than performing the arithmetic operations when solving the problem. In order to improve the performance of iterative solvers within the high-performance computing context, so-called matrix-free methods are widely adopted in the fluid mechanics community, where matrix-vector products are computed on the fly. To date, there have been few (if any) assessments into the applicability of the matrix-free approach to problems in solid mechanics. In this work, we perform an initial investigation on the application of the matrix-free approach to problems in quasi-static finite-strain hyperelasticity to determine whether it is viable for further extension. Specifically, we study different numerical implementations of the finite element tangent operator, and determine whether generalized methods of incorporating complex constitutive behavior might be feasible. In order to improve the convergence behavior of iterative solvers, we also propose a method by which to construct level tangent operators and employ them to define a geometric multigrid preconditioner. The performance of the matrix-free operator and the geometric multigrid preconditioner is compared to the matrix-based implementation with an algebraic multigrid preconditioner on a single node for a representative numerical example of a heterogeneous hyperelastic material in two and three dimensions. We conclude that the application of matrix-free methods to finite-strain solid mechanics is promising, and that it is possible to develop numerically efficient implementations that are independent of the hyperelastic constitutive law.
