2021
DOI: 10.1155/2021/6804723

Developing a Multi-GPU-Enabled Preconditioned GMRES with Inexact Triangular Solves for Block Sparse Matrices

Abstract: Solving triangular systems is the building block for preconditioned GMRES algorithm. Inexact preconditioning becomes attractive because of the feature of high parallelism on accelerators. In this paper, we propose and implement an iterative, inexact block triangular solve on multi-GPUs based on PETSc’s framework. In addition, by developing a distributed block sparse matrix-vector multiplication procedure and investigating the optimized vector operations, we form the multi-GPU-enabled preconditioned GMRES with …
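
To make the idea of an iterative, inexact triangular solve concrete, here is a minimal sketch. The truncated abstract does not say which iteration the authors use, so this assumes a Jacobi-style sweep on a scalar (non-block) lower-triangular system; the function name `jacobi_lower_solve` and the dense list-of-rows storage are illustrative choices, not taken from the paper or from PETSc.

```python
def jacobi_lower_solve(L, b, sweeps):
    """Approximate x in L x = b for a lower-triangular L given as a list of rows.

    Assumption: Jacobi-style sweeps stand in for the paper's inexact solve.
    """
    n = len(b)
    # Initial guess: ignore all off-diagonal coupling.
    x = [b[i] / L[i][i] for i in range(n)]
    for _ in range(sweeps):
        # Each component reads only the previous iterate, so the n updates
        # are independent of one another (unlike forward substitution).
        x = [
            (b[i] - sum(L[i][j] * x[j] for j in range(i))) / L[i][i]
            for i in range(n)
        ]
    return x


L = [[2.0, 0.0, 0.0],
     [1.0, 4.0, 0.0],
     [3.0, -2.0, 5.0]]
b = [2.0, 9.0, 4.0]
approx = jacobi_lower_solve(L, b, sweeps=1)  # inexact: a single sweep
exact = jacobi_lower_solve(L, b, sweeps=2)   # n - 1 sweeps reproduce the exact solve
```

Because the strictly lower part of a triangular matrix is nilpotent, n - 1 sweeps reproduce the exact solution for an n × n system. Stopping earlier gives the inexact variant, which gives up some accuracy so that every sweep can update all components in parallel.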

Cited by 2 publications (2 citation statements)
References 29 publications

“…One has been done was by using the accelerated projection-based consensus [17]. Other methods, by utilizing the parallel iterative methods, for instance, such as by Sultanov et al [18], Ma et al [19], and Anzt et al [20]. The recent parallel works on GMRES running on NVIDIA GPGPU accelerator for solving systems of linear equations based on the sparse matrices, were discussed by Minin et al …”
Section: Parallel Of Switching Models
Citation type: mentioning
confidence: 99%
“…In [23, 38-41], most of them attempted the left-looking or the hybrid column-based right-looking algorithm to perform a parallel LU decomposition, accelerating the sparse linear solver. However, multi-GPU methods have also been proposed to improve the performance of sparse linear solves [42-46] with the development of parallel devices. The cost of communication, however, has become the main bottleneck rather than computing capability.…”
Section: Introduction
Citation type: mentioning
confidence: 99%