2020
DOI: 10.48550/arxiv.2011.00715
Preprint

Toward Performance-Portable PETSc for GPU-based Exascale Systems

Cited by 4 publications (7 citation statements)
References 15 publications
“…For the preconditioning step, we consider a smoothed aggregation algebraic multigrid method constructed on the matrix C, using a diagonally preconditioned Chebyshev method as a smoother. The setup of the preconditioner runs partly on the CPU and partly on the GPU, while the Krylov solver, including the preconditioner application, runs entirely on the GPU [42,56].…”
Section: Strong Scalability
confidence: 99%
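
For context, the solver configuration described in this excerpt maps onto standard PETSc runtime options. The following is a sketch, not the citing authors' exact settings; the executable name is hypothetical, but the option names are real PETSc options (GAMG smoothed aggregation, Jacobi-preconditioned Chebyshev smoothers, CUDA matrix and vector types):

    ./app -ksp_type cg \
          -pc_type gamg -pc_gamg_type agg \
          -mg_levels_ksp_type chebyshev \
          -mg_levels_pc_type jacobi \
          -mat_type aijcusparse -vec_type cuda

This matches the excerpt's description: with the aijcusparse/cuda backends, the multigrid setup still performs part of its work on the CPU, while the Krylov iteration and the preconditioner application run on the GPU.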
“…A symmetric address is the address of a symmetric object on the local PE, plus an offset if needed. The code below allocates two symmetric double arrays src[1] and dst[2], and every PE puts a double from its src[0] to the next PE's dst[1].…”
Section: Stream-aware NVSHMEM Support
confidence: 99%
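
The listing this quote refers to did not survive extraction. The sketch below reconstructs something consistent with the description using the standard NVSHMEM host API (nvshmem_malloc, nvshmem_double_put, nvshmem_barrier_all); the program scaffolding and device assignment are our assumptions, not the paper's original code.

    /* Minimal sketch, not the paper's original listing: every PE puts one
     * double from its src[0] into dst[1] on the next PE. */
    #include <cuda_runtime.h>
    #include <nvshmem.h>

    int main(void) {
      nvshmem_init();
      int mype = nvshmem_my_pe();
      int npes = nvshmem_n_pes();
      int ndev;
      cudaGetDeviceCount(&ndev);
      cudaSetDevice(mype % ndev);  /* one GPU per PE (assumed layout) */

      /* Symmetric allocations: the same call on every PE, so src and dst
       * are symmetric objects and their addresses are symmetric. */
      double *src = (double *)nvshmem_malloc(1 * sizeof(double)); /* src[1] */
      double *dst = (double *)nvshmem_malloc(2 * sizeof(double)); /* dst[2] */

      double v = (double)mype;
      cudaMemcpy(src, &v, sizeof(double), cudaMemcpyHostToDevice);

      /* dst + 1 is a symmetric address: the symmetric object dst plus an
       * offset of one element, resolved on the destination PE. */
      nvshmem_double_put(dst + 1, src, 1, (mype + 1) % npes);
      nvshmem_barrier_all();

      nvshmem_free(src);
      nvshmem_free(dst);
      nvshmem_finalize();
      return 0;
    }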
“…In [1] we discuss the plans and progress in adapting the Portable, Extensible Toolkit for Scientific Computation and Toolkit for Advanced Optimization [2] (PETSc) to CPU-GPU systems. This paper focuses specifically on the plans for managing the network and intra-node communication within PETSc.…”
Section: Introduction
confidence: 99%
“…Iterative linear solvers are often preferred for solving large-scale linear systems, as they can take advantage of problem structure such as sparsity or bandedness, require inexpensive floating point operations, and can be readily paired with preconditioning techniques [19, see preface]. While such iterative linear solvers as Conjugate Gradients (CG) and the Generalized Minimal Residual method (GMRES) are still dominant solvers in practice, randomized row-action [8,1,14,23] and column-action iterative solvers [10,25] have been growing in interest for several reasons: they (usually) require very few floating point operations per iteration [5,3]; they have low-memory footprints [9]; they can readily be composed with randomization techniques to quickly produce approximate solutions [23,10,24,6,11,2,7,17]; they can be used for solving systems constructed in a streaming fashion (e.g., [15]), which supports emerging computing paradigms (e.g., [13]); and, just like the more popular iterative Krylov solvers, they can be parallelized, preconditioned or combined with other linear solvers [20,16,4,18];…”
Section: Introduction
confidence: 99%
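
As a concrete instance of the row-action idea mentioned in this excerpt, the sketch below implements randomized Kaczmarz, the classical randomized row-action solver; it is our illustration, not code from the cited paper. Each iteration samples one row a_i of A (with probability proportional to ||a_i||^2) and projects the iterate onto the hyperplane a_i^T x = b_i, so both the per-iteration cost and the memory footprint are small.

    /* Randomized Kaczmarz: a minimal row-action solver sketch. */
    #include <stdio.h>
    #include <stdlib.h>

    static void randomized_kaczmarz(int m, int n, const double *A,
                                    const double *b, double *x, int iters) {
      double *norms2 = malloc((size_t)m * sizeof(double));
      double total = 0.0;
      for (int i = 0; i < m; i++) {           /* row norms ||a_i||^2 */
        double s = 0.0;
        for (int j = 0; j < n; j++) s += A[i*n+j] * A[i*n+j];
        norms2[i] = s;
        total += s;
      }
      for (int k = 0; k < iters; k++) {
        /* Sample row i with probability ||a_i||^2 / ||A||_F^2. */
        double r = total * rand() / RAND_MAX, acc = 0.0;
        int i = m - 1;
        for (int t = 0; t < m; t++) { acc += norms2[t]; if (r <= acc) { i = t; break; } }
        /* Project the iterate onto the hyperplane a_i^T x = b_i. */
        double dot = 0.0;
        for (int j = 0; j < n; j++) dot += A[i*n+j] * x[j];
        double step = (b[i] - dot) / norms2[i];
        for (int j = 0; j < n; j++) x[j] += step * A[i*n+j];
      }
      free(norms2);
    }

    int main(void) {
      enum { M = 8, N = 3 };
      double A[M*N], b[M], xt[N] = {1.0, -2.0, 0.5}, x[N] = {0};
      for (int i = 0; i < M*N; i++) A[i] = (double)rand() / RAND_MAX - 0.5;
      for (int i = 0; i < M; i++) {           /* consistent system b = A*xt */
        b[i] = 0.0;
        for (int j = 0; j < N; j++) b[i] += A[i*N+j] * xt[j];
      }
      randomized_kaczmarz(M, N, A, b, x, 2000);
      printf("x = %.4f %.4f %.4f (target 1, -2, 0.5)\n", x[0], x[1], x[2]);
      return 0;
    }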