2020
DOI: 10.1177/1094342020915762

Scalability of high-performance PDE solvers

Abstract: Performance tests and analyses are critical to effective high-performance computing software development and are central components in the design and implementation of computational algorithms for achieving faster simulations on existing and future computing architectures for large-scale application problems. In this article, we explore performance and space-time trade-offs for important compute-intensive kernels of large-scale numerical solvers for partial differential equations (PDEs) that govern a wide rang…

Cited by 68 publications (110 citation statements)
References 15 publications
“…The benchmark involves a continuous finite element discretization of the Laplacian (3), using matrix-free operator evaluation within a conjugate gradient solver preconditioned by the matrix diagonal. In this study, we consider the case BP5 [29], see also https://ceed.exascaleproject.org/bps/, which integrates the weak form (4) of polynomial degree $k$ using a Gauss-Lobatto quadrature formula with $n_q^{\text{1D}} = k + 1$ quadrature points on a cube with deformed elements. While this integration is not exact, it is the typical spectral element setup with an identity interpolation matrix $S_i = I$ in Eq.…”
Section: Performance-optimized Conjugate Gradient Methods (mentioning)
confidence: 99%
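
The solver structure named in this statement (matrix-free operator evaluation inside a conjugate gradient iteration preconditioned by the matrix diagonal) can be sketched in a few lines. This is a minimal illustration only, not the citing paper's implementation and not the BP5 benchmark: it assumes a 1D finite-difference Laplacian with the stencil [-1, 2, -1] in place of a finite element operator, and the names `apply_laplacian` and `jacobi_pcg` are hypothetical.

```python
def apply_laplacian(u):
    """Matrix-free action of the 1D stencil [-1, 2, -1] with zero Dirichlet ends.

    Assumption: a toy stand-in for a finite element operator; no matrix is stored.
    """
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out[i] = 2.0 * u[i] - left - right
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def jacobi_pcg(apply_op, diag, b, tol=1e-10, max_iter=1000):
    """Conjugate gradients preconditioned by the operator diagonal (Jacobi)."""
    x = [0.0] * len(b)
    r = list(b)                                # residual for the initial guess x = 0
    z = [ri / di for ri, di in zip(r, diag)]   # apply the diagonal preconditioner
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        ap = apply_op(p)                       # the only place the operator is used
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


n = 50
b = [1.0] * n
diag = [2.0] * n          # diagonal of the stencil operator
x, iters = jacobi_pcg(apply_laplacian, diag, b)
```

The solver only ever calls `apply_op`, so swapping in an element-based operator evaluation would not change the iteration itself. With a constant diagonal, as in this toy case, the preconditioner is only a scaling; it becomes relevant when the diagonal varies, for example on deformed elements.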
“…Since the metric terms do not depend on the shape function indices $i_K$ and $j$, and the sum over $j$ does not depend on $i_K$, the summations in the equation can be broken up into (1) a $d n_q \times n_{\text{dof,ele}}$ matrix operation to evaluate the reference element derivative of $u^{(K)}$ at the quadrature points, (2) the application of metric terms as well as other physics terms at $n_q$ quadrature points, and (3) an $n_{\text{dof,ele}} \times d n_q$ matrix operation to test by all $n_{\text{dof,ele}}$ test functions and perform the summation over the quadrature points. The separation of point-wise physics evaluation at quadrature points is a common abstraction in integration-based matrix-free methods [29,41,49,50].…”
Section: Node-level Performance Through Matrix-free Implementation (mentioning)
confidence: 99%
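
The three-step splitting described in this statement can be sketched for a single element. The code below is an illustration under stated assumptions, not the citing paper's kernel: it assumes a bilinear (Q1) element on an axis-aligned square of side `h`, a 2x2 Gauss quadrature, and the Laplacian as the only physics term. The function names and the row layout of `G` (row `q*d + c` holds the derivative in reference direction `c` at quadrature point `q`) are hypothetical choices.

```python
import math


def apply_element_laplacian(G, metric, u, d):
    """Apply the element Laplacian to local coefficients u in three steps."""
    n_q = len(metric)
    n_dof = len(u)
    # (1) (d*n_q) x n_dof product: reference gradient of u at the quadrature points
    grad = [sum(G[r][j] * u[j] for j in range(n_dof)) for r in range(d * n_q)]
    # (2) point-wise metric (and physics) terms, independent of shape function indices
    flux = [0.0] * (d * n_q)
    for q in range(n_q):
        for a in range(d):
            flux[q * d + a] = sum(metric[q][a][b] * grad[q * d + b] for b in range(d))
    # (3) n_dof x (d*n_q) product: test by all shape functions, sum over quadrature points
    return [sum(G[r][i] * flux[r] for r in range(d * n_q)) for i in range(n_dof)]


def q1_square_setup(h):
    """Reference gradients and metric terms for a Q1 element on a square of side h.

    Assumption: Jacobian J = h * I, so the metric term at each point is
    w_q * det(J) * J^{-1} J^{-T} = w_q * I in two dimensions.
    """
    pts = [0.5 - 0.5 / math.sqrt(3.0), 0.5 + 0.5 / math.sqrt(3.0)]  # Gauss points on [0, 1]
    w = 0.5                                                          # 1D Gauss weight
    shape = [lambda t: 1.0 - t, lambda t: t]                         # 1D linear shape functions
    dshape = [-1.0, 1.0]                                             # their derivatives
    G, metric = [], []
    for eta in pts:
        for xi in pts:
            # dof index = 2*b + a, with a the xi-direction and b the eta-direction index
            G.append([dshape[a] * shape[b](eta) for b in range(2) for a in range(2)])
            G.append([shape[a](xi) * dshape[b] for b in range(2) for a in range(2)])
            scale = (w * w) * (h * h) / (h * h)
            metric.append([[scale, 0.0], [0.0, scale]])
    return G, metric


G, metric = q1_square_setup(0.25)
# Applying the kernel to a unit vector returns one column of the element matrix.
column0 = apply_element_laplacian(G, metric, [1.0, 0.0, 0.0, 0.0], 2)
# A constant field has zero gradient, so the result should vanish up to rounding.
constant = apply_element_laplacian(G, metric, [1.0, 1.0, 1.0, 1.0], 2)
```

Only step (2) depends on the geometry and the PDE, which is why the statement calls the point-wise evaluation a common abstraction: steps (1) and (3) reuse the same reference-element table `G` for every element.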
“…To achieve this efficiency, high-order methods use mesh elements that are mapped from canonical reference elements (hexes, wedges, pyramids and tetrahedra) and exploit, where possible, the tensor-product structure of the canonical mesh elements and finite-element spaces. Through matrix-free partial assembly, the use of canonical reference elements enables substantial cache efficiency and minimizes extraneous data movement in comparison to traditional low-order approaches [5].…”
Section: (E) Efficient Finite-element Discretization Of PDEs On Unstr… (mentioning)
confidence: 99%
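
One way to see what exploiting the tensor-product structure means is to compare a naive evaluation of a 2D tensor-product operator with a sum-factorized one that applies the 1D operator direction by direction. The sketch below is a generic illustration, not code from the cited or citing work; the 1D matrix `A` (linear shape functions evaluated at the points 0, 0.5, and 1) and the function names are assumptions made for the example.

```python
def apply_naive(A, u):
    """y[j][i] = sum over l, k of A[j][l] * A[i][k] * u[l][k].

    Cost: m*m*n*n multiply-adds for an m x n matrix A.
    """
    m, n = len(A), len(A[0])
    return [[sum(A[j][l] * A[i][k] * u[l][k] for l in range(n) for k in range(n))
             for i in range(m)] for j in range(m)]


def apply_sum_factorized(A, u):
    """Same result via two 1D passes.

    Cost: m*n*n + m*m*n multiply-adds.
    """
    m, n = len(A), len(A[0])
    # pass 1: apply A along the first (xi) direction for every row of u
    tmp = [[sum(A[i][k] * u[l][k] for k in range(n)) for i in range(m)]
           for l in range(n)]
    # pass 2: apply A along the second (eta) direction
    return [[sum(A[j][l] * tmp[l][i] for l in range(n)) for i in range(m)]
            for j in range(m)]


# Hypothetical 1D interpolation matrix: linear shape functions at 0, 0.5, 1.
A = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
u = [[1.0, 2.0], [3.0, 4.0]]          # nodal values on a 2 x 2 tensor grid
y_naive = apply_naive(A, u)
y_fast = apply_sum_factorized(A, u)
```

Both functions return the bilinear interpolant of `u` on a 3 x 3 grid. The gap in operation counts grows with the polynomial degree and is larger still in three dimensions.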
“…The finite element method (FEM) with linear or quadratic elements (that reduce the potential for locking), or mixed or enhanced formulations (that avoid locking) are often adopted, whereas higher order elements are rarely employed. This can potentially reduce the advantage of matrix‐free methods using sum‐factorization techniques, which are generally more competitive for higher order elements. However, we note that, even with linear FEM, large‐scale computations with as many as $10^{12}$ unknowns are possible with a matrix‐free approach but infeasible with sparse matrices as there is simply not enough memory to store the sparse tangent matrix even on large supercomputers.…”
Section: Introduction (mentioning)
confidence: 99%
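
The memory argument in this statement can be made concrete with a back-of-envelope estimate. The figures below are assumptions chosen for illustration and do not come from the citing paper: a compressed sparse row (CSR) layout with 8-byte values and 8-byte indices, and 81 nonzeros per row (3 displacement components coupled to 27 neighbouring nodes, as for interior nodes of a structured trilinear hexahedral mesh). The function names are hypothetical.

```python
def csr_bytes(n_rows, nnz_per_row, value_bytes=8, index_bytes=8):
    """Rough CSR storage: values and column indices per nonzero, plus row pointers."""
    nnz = n_rows * nnz_per_row
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes


def vector_bytes(n_rows, value_bytes=8):
    """Storage for one solution-sized vector in double precision."""
    return n_rows * value_bytes


n_unknowns = 10 ** 12
matrix_mem = csr_bytes(n_unknowns, 81)     # about 1.3e15 bytes under these assumptions
vector_mem = vector_bytes(n_unknowns)      # 8e12 bytes
ratio = matrix_mem / vector_mem            # matrix storage in units of one vector
```

Under these assumptions the assembled matrix costs roughly 160 times the memory of a single solution vector. A matrix-free method mainly stores a handful of vectors plus geometry data, which is the trade-off the statement points to.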