2020
DOI: 10.48550/arxiv.2010.12058
Preprint

An overview of block Gram-Schmidt methods and their stability properties

Abstract: Block Gram-Schmidt algorithms comprise essential kernels in many scientific computing applications, but for many commonly used variants, a rigorous treatment of their stability properties remains open. This survey provides a comprehensive categorization of block Gram-Schmidt algorithms, especially those used in Krylov subspace methods to build orthonormal bases one block vector at a time. All known stability results are assembled, and new results are summarized or conjectured for important communication-reduci…

Cited by 2 publications (4 citation statements)
References 42 publications
“…Our numerical results have demonstrated that DCGS2 obtains the same loss of orthogonality and representation error as CGS2, while our strong-scaling results on the Summit supercomputer indicate that DCGS2 obtains a speedup of 2× faster compute time on a single GPU, and an even larger speedup on an increasing number of GPUs, reaching 2.2× lower execution times on 192 GPUs. The impact of DCGS2 on the strong scaling of Krylov linear system solvers is currently being explored, and a block variant is also being implemented following the review article of Carson et al [15]. The software employed for this paper is available on GitHub.…”
Section: Discussion (mentioning)
confidence: 99%
“…This is achieved by lagging the normalization as originally proposed by Kim and Chronopoulos [14] and then applying Stephen's trick. The Pythagorean trick introduced by Smoktunowicz et al [8] avoids cancellation errors and Carson et al [15] generalize this to block Gram-Schmidt algorithms. The delayed normalization for the Arnoldi iteration was employed by Hernandez et al [1] without a correction.…”
Section: Low-synch Gram-Schmidt Algorithms (mentioning)
confidence: 99%
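
The Pythagorean trick mentioned in this statement can be illustrated with a minimal sketch. It assumes the common reading of the idea: for a vector a and an orthonormal Q, the norm of the projected vector satisfies ||a - Q Qᵀa||² = ||a||² - ||Qᵀa||² in exact arithmetic, and the right-hand side is evaluated in factored form so that two squared quantities are never subtracted directly. The function name `cgs_step_pythagorean` and its signature are hypothetical and are not taken from any of the cited papers.

```python
import numpy as np

def cgs_step_pythagorean(Q, a):
    """One classical Gram-Schmidt step; hypothetical helper for illustration.

    Assumes the columns of Q are orthonormal and that a is not
    (numerically) in their span, so that psi > phi below.
    """
    r = Q.T @ a                  # coefficients of a along the columns of Q
    w = a - Q @ r                # unnormalized new direction
    psi = np.linalg.norm(a)      # ||a||
    phi = np.linalg.norm(r)      # ||Q^T a||
    # Exact arithmetic: ||w||^2 = psi^2 - phi^2 = (psi - phi) * (psi + phi).
    # The factored form avoids subtracting two squared quantities.
    rho = np.sqrt(psi - phi) * np.sqrt(psi + phi)
    return w / rho, r, rho

# Small example: Q spans the first two coordinate axes of R^3.
Q = np.eye(3)[:, :2]
a = np.array([3.0, 4.0, 12.0])
q_new, r, rho = cgs_step_pythagorean(Q, a)
```

In this example psi = 13 and phi = 5, so the factored expression gives the norm of the remaining component, 12, without forming 169 - 25.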
“…In the first case, the condition number of Q can grow as cond(W)^2 or even worse and thus requires special treatment [8]. While in the second case, cond(Q) can grow as cond(W) · max_{1≤j≤p} cond(W^{(j)}) unless Step 2 is unconditionally stable [6,9]. The stability of these processes can be improved by re-orthogonalization, i.e., by running the inner loop twice.…”
Section: Algorithm 1 Block Gram-Schmidt Process (mentioning)
confidence: 99%
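
The re-orthogonalization described in this statement can be sketched as follows. This is an illustrative block classical Gram-Schmidt loop, not a reproduction of the citing paper's Algorithm 1: the function name `block_gram_schmidt`, the `block_size` and `passes` parameters, and the use of NumPy's Householder-based `np.linalg.qr` for the intra-block step are all assumptions made for the example. Only the orthonormal factor Q is returned; the triangular factor is not accumulated.

```python
import numpy as np

def block_gram_schmidt(W, block_size, passes=2):
    """Orthonormalize the columns of W one block at a time (illustrative sketch).

    passes=1 is a single projection plus intra-block QR;
    passes=2 repeats both steps, i.e. re-orthogonalization.
    Assumes W has full column rank and at least as many rows as columns.
    """
    n, m = W.shape
    Q = np.zeros((n, m))
    for start in range(0, m, block_size):
        stop = min(start + block_size, m)
        V = W[:, start:stop].astype(float)   # working copy of the current block
        for _ in range(passes):
            if start > 0:
                Q_prev = Q[:, :start]
                # Inter-block step: project out the previously computed basis.
                V -= Q_prev @ (Q_prev.T @ V)
            # Intra-block step: orthonormalize the block itself.
            V, _ = np.linalg.qr(V)
        Q[:, start:stop] = V
    return Q

rng = np.random.default_rng(0)
W = rng.standard_normal((50, 8))
Q = block_gram_schmidt(W, block_size=2)
# Loss of orthogonality, measured as ||I - Q^T Q||.
loss = np.linalg.norm(np.eye(8) - Q.T @ Q)
```

Setting `passes=1` and comparing the resulting `loss` on an ill-conditioned W is one way to observe the effect of the second pass.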
“…It is also used in s-step, enlarged and other communication-avoiding Krylov subspace methods [12,14]. Please see [9] and the references therein for an extensive overview of BGS variants, and [2,17,18,22] for the underlying block Krylov methods.…”
Section: Introduction (mentioning)
confidence: 99%