Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures 2022
DOI: 10.1145/3490148.3538587

I/O-Optimal Algorithms for Symmetric Linear Algebra Kernels

Abstract: In this paper, we consider two fundamental symmetric kernels in linear algebra: the Cholesky factorization and the symmetric rank-k update (SYRK), with the classical three-nested-loop algorithms for these kernels. In addition, we consider a machine model with a fast memory of size $M$ and an unbounded slow memory. In this model, all computations must be performed on operands in fast memory, and the goal is to minimize the amount of communication between slow and fast memories. As the set of computations is fixed by …
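As an illustration of the "three nested loops" formulation mentioned in the abstract, here is a minimal Python sketch of both kernels. The function names, the list-of-lists matrix layout, the right-looking loop order for Cholesky, and the convention that SYRK computes the lower triangle of C += A·Aᵀ are assumptions made for this example; the paper itself may order the loops or define the kernels differently.

```python
import math


def cholesky_three_loops(A):
    """Return lower-triangular L with L * L^T = A.

    Assumes A is a symmetric positive definite matrix given as a list of rows.
    Only the lower triangle of A is read.
    """
    n = len(A)
    # Copy the lower triangle; entries above the diagonal are set to zero.
    L = [[A[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        L[k][k] = math.sqrt(L[k][k])
        for i in range(k + 1, n):
            L[i][k] /= L[k][k]
        # Trailing update: together with the k loop, this is the three-loop nest.
        for j in range(k + 1, n):
            for i in range(j, n):
                L[i][j] -= L[i][k] * L[j][k]
    return L


def syrk_three_loops(A, C):
    """Update the lower triangle of C in place with C += A * A^T.

    A is n x m, C is n x n; entries of C above the diagonal are left untouched.
    """
    n, m = len(A), len(A[0])
    for i in range(n):
        for j in range(i + 1):
            for k in range(m):
                C[i][j] += A[i][k] * A[j][k]
    return C
```

Both routines touch only the lower triangle of the symmetric matrix, which is the structural property that distinguishes these kernels from general matrix multiplication or LU factorization.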

Cited by 6 publications (25 citation statements)
References 14 publications
“…Based on a discrete version of classical Loomis-Whitney projection argument developed in [12], Olivry et al establish in [8], that Cholesky factorization requires at least $\frac{n^3}{6\sqrt{M}}$ data transfers. A specific analysis adapted to the symmetric nature of this factorization allowed Beaumont et al [13] to further improve the bound and obtain an optimal value of…”
Section: Related Work (mentioning)
confidence: 99%
“…In 2009, Béreux [14] proposed a sequential out-of-core Cholesky algorithm with "narrow blocks" that performs at most $\frac{n^3}{3\sqrt{M}} + O(n^2)$ data transfers. The authors of [13] have improved this result with an algorithm which performs…”
Section: Related Work (mentioning)
confidence: 99%
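To make the gap between the two quoted Cholesky bounds concrete, the following sketch evaluates the lower bound attributed to Olivry et al, n³/(6√M), and the leading term of the upper bound attributed to Béreux, n³/(3√M). The function names are hypothetical, the O(n²) term of the upper bound is deliberately ignored, and the improved values from [13] are not included because the quoted statements truncate them.

```python
import math


def cholesky_lower_bound(n, M):
    """Lower bound n^3 / (6 * sqrt(M)) on data transfers, as quoted above."""
    return n ** 3 / (6 * math.sqrt(M))


def narrow_block_upper_bound_leading(n, M):
    """Leading term n^3 / (3 * sqrt(M)) of the quoted upper bound.

    The O(n^2) lower-order term is omitted (assumption made for illustration).
    """
    return n ** 3 / (3 * math.sqrt(M))


def leading_term_ratio(n, M):
    """Ratio of the upper-bound leading term to the lower bound."""
    return narrow_block_upper_bound_leading(n, M) / cholesky_lower_bound(n, M)
```

For any n and M, the leading terms differ by a factor of 2, which is the gap that the later analysis cited as [13] is described as closing.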
“…The SBC distribution (Symmetric Block Cyclic) was introduced recently [3]. It is valid for specific values of $P$, of the form either $a^2/2$ for an even integer $a$, or $a(a-1)/2$ for any integer $a$. SBC induces a communication volume lower by a factor of $\sqrt{2}$ than 2DBC, but however remains within a factor of $\sqrt{2}$ of the lower bound [8]. In this paper, we propose an extension of this symmetric distribution to all possible values of $P$.…”
Section: Introduction (mentioning)
confidence: 99%
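The restriction on P described in the statement above can be enumerated directly. This is a small sketch with a hypothetical function name; it assumes a ≥ 2 so that P is at least 1, and it only lists the admissible values without implementing the SBC distribution itself.

```python
def sbc_valid_node_counts(max_p):
    """List values P <= max_p of the form a^2/2 (a even) or a(a-1)/2.

    Assumes a >= 2, so the degenerate value P = 0 is excluded.
    """
    valid = set()
    a = 2
    # a(a-1)/2 < a^2/2, so this loop condition also covers every a^2/2 <= max_p.
    while a * (a - 1) // 2 <= max_p:
        valid.add(a * (a - 1) // 2)
        if a % 2 == 0 and a * a // 2 <= max_p:
            valid.add(a * a // 2)
        a += 1
    return sorted(valid)
```

Under these assumptions, `sbc_valid_node_counts(21)` returns `[1, 2, 3, 6, 8, 10, 15, 18, 21]`, which shows how sparse the admissible values are and why an extension to all values of P is of interest.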