2020
DOI: 10.1007/978-3-030-43229-4_10
Parallel Performance of an Iterative Solver Based on the Golub-Kahan Bidiagonalization

Abstract: We present a scalability study of Golub-Kahan bidiagonalization for the parallel iterative solution of symmetric indefinite linear systems with a 2 × 2 block structure. The algorithms have been implemented within the parallel numerical library PETSc. Since a nested inner-outer iteration strategy may be necessary, we investigate different choices for the inner solvers, including parallel sparse direct and multigrid accelerated iterative methods. We show the strong and weak scalability of the Golub-Kahan bidiagonalization…
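The paper's Algorithm 1 is not reproduced on this page. Purely as an illustration of the structure described in the abstract, the following is a minimal dense NumPy sketch of a generalized Golub-Kahan bidiagonalization (Craig's variant) for a 2 × 2 block saddle-point system. The function name gkb_saddle_point, the choice of inner product N = I, and the simple coefficient-based stopping test are assumptions made for this sketch; in the parallel PETSc setting every application of M^{-1} would instead be an inner solve (sparse direct or multigrid accelerated), which is exactly where the nested inner-outer strategy enters.

import numpy as np

def gkb_saddle_point(M, A, f, g, tol=1e-10, maxit=200):
    # Solve [M A; A^T 0] [u; p] = [f; g] with M symmetric positive definite
    # and A of full column rank, using a generalized Golub-Kahan
    # bidiagonalization (Craig's variant).  Dense illustration only:
    # M is inverted once, whereas a parallel implementation would call an
    # inner (direct or multigrid) solver for every application of M^{-1}.
    Minv = np.linalg.inv(M)

    # Shift the right-hand side so that the transformed system reads
    #   M w + A p = 0,  A^T w = g_hat,   with  u = M^{-1} f + w.
    u0 = Minv @ f
    g_hat = g - A.T @ u0

    # Initialization of the bidiagonalization (inner product N = I).
    beta = np.linalg.norm(g_hat)
    if beta == 0.0:                        # u = M^{-1} f, p = 0 already solves the system
        return u0, np.zeros(A.shape[1])
    v = g_hat / beta
    t = Minv @ (A @ v)
    alpha = np.sqrt(t @ (M @ t))           # M-norm of t
    q = t / alpha                          # first M-orthonormal direction

    z = beta / alpha                       # expansion coefficient z_1
    d = v / alpha
    w = z * q                              # approximation of the shifted unknown
    p = -z * d

    for _ in range(1, maxit):
        s = A.T @ q - alpha * v
        beta = np.linalg.norm(s)
        if beta == 0.0:                    # breakdown: constraint satisfied exactly
            break
        v = s / beta
        t = Minv @ (A @ v) - beta * q
        alpha = np.sqrt(t @ (M @ t))
        q = t / alpha
        # Craig-type updates of the coefficients and iterates.
        z = -(beta / alpha) * z
        d = (v - beta * d) / alpha
        w = w + z * q
        p = p - z * d
        # Crude surrogate for the energy-norm stopping test of the paper:
        # stop once the newest coefficient is small relative to ||w||_M.
        if abs(z) <= tol * np.sqrt(w @ (M @ w)):
            break

    return u0 + w, p

For a quick sanity check one can take a small SPD M, a random A of shape (n, m) with n > m and full column rank, call u, p = gkb_saddle_point(M, A, f, g), and compare against np.linalg.solve applied to the assembled block matrix.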

Cited by 5 publications (6 citation statements)
References 24 publications
“…We observe that the speed-up reaches half of the ideal speed-up. Such a result is consistent with what we observed in [14].…”
Section: Strong Scaling (supporting)
Confidence: 93%
“…Here, we focus on the aforementioned Craig's variant algorithm for the solution of saddle point systems, which is presented in Algorithm 1. As a stopping criterion, we use a normalized lower bound estimate of the energy norm error e_k := ‖u − u_k‖_M described in [13,14]. The algorithm stops once this normalized lower bound undershoots a sufficiently small tolerance τ.…”
Section: The Algorithm (mentioning)
Confidence: 99%
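For orientation only (the precise bound and normalization are given in [13,14] and are not reproduced on this page), the kind of estimate quoted above can be sketched as follows, assuming the iterate u_k is expanded in M-orthonormal directions with coefficients z_j and using a delay parameter d:

\|u - u_k\|_M^2 \;=\; \sum_{j>k} z_j^2 \;\ge\; \sum_{j=k+1}^{k+d} z_j^2 \;=:\; \xi_{k,d}^2 .

The partial sum \xi_{k,d}^2 is computable at iteration k+d and bounds the squared energy-norm error at iteration k from below; normalizing it (for instance by an estimate of \|u_k\|_M^2) and comparing against \tau^2 yields a stopping test of the type described in the quotation. The symbols z_j, d, and \xi_{k,d} are notation introduced here for this sketch.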
“…As the reactor containment building test case from EDF was not yet available for the strong and weak scalability investigations, we have adapted Stokes sample problems provided in the PETSc distribution. In this work, we focus on extending and improving the parallel results presented in our conference contribution [8] with the following novel aspects:
- We ran the examples on a larger number of cores, that is, 1024, and we studied the scalability of a bigger matrix for the Poiseuille flow test case (m ≈ 16.8·10^6, n ≈ 8.4·10^6).
- We introduce a three-dimensional Stokes example and we comment on its strong scalability.
- We discuss the weak scalability of the nested inner-outer iterative variants of our solver on the three test cases in two and three dimensions.
- By linking with the Intel Math Kernel Library (MKL) for executing dense linear algebra operations, we improve the previously obtained computation times, especially those for the employed parallel sparse direct solver MUMPS [9] (Multifrontal Massively Parallel sparse direct Solver). We obtained, for example, a speed-up by a factor of more than 5 for the standalone MUMPS solver and of about 3 for GKB-MUMPS for computations on two cores.
- We investigate the portability and present the performance of the algorithm on an Advanced Micro Devices (AMD) architecture.…”
Section: Introduction (mentioning)
Confidence: 99%