2002
DOI: 10.1142/s0129626402000847

Trading Replication for Communication in Parallel Distributed-Memory Dense Solvers

Abstract: We present new communication-efficient parallel dense linear solvers: a solver for triangular linear systems with multiple right-hand sides and an LU factorization algorithm. These solvers are highly parallel and they perform a factor of $0.4P^{1/6}$ less communication than existing algorithms, where $P$ is the number of processors. The new solvers reduce communication at the expense of using more temporary storage. Previously, algorithms that reduce communication by using more memory were only known for matrix multipli…
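To get a feel for the factor quoted in the abstract, the short sketch below evaluates $0.4P^{1/6}$ for a few processor counts. The function name and the sample values of $P$ are illustrative assumptions and do not come from the paper; the abstract gives only the formula.

```python
def communication_reduction_factor(p: int) -> float:
    """Evaluate 0.4 * P**(1/6), the factor quoted in the abstract.

    p is the number of processors (P in the abstract). The function name
    is hypothetical and used only for this illustration.
    """
    if p < 1:
        raise ValueError("p must be a positive integer")
    return 0.4 * p ** (1.0 / 6.0)


if __name__ == "__main__":
    # Sample processor counts chosen as perfect sixth powers (assumption).
    for p in (64, 4096, 262144):
        print(f"P = {p:>6}: factor = {communication_reduction_factor(p):.2f}")
```

Taken at face value, the expression exceeds 1 only when $P > 2.5^6 \approx 244$, so the stated saving would apply to machines with at least a few hundred processors.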

Cited by 10 publications (3 citation statements)
References 10 publications
“…Similar asymptotic optimality results are proven for the 3D matrix multiplication [7] and 2.5D matrix multiplication and LU factorization [15]. An early implementation of fully 3D distributed algorithms for triangular solve and LU factorization without pivoting is proposed in [16] that are asymptotically optimal for the total volume of communications. The 2.5D algorithms bring a continuum between 2D and 3D algorithms where the trade-off between memory footprint and communication is controlled by a parameter.…”
Section: Related Work (mentioning)
confidence: 52%
“…Parallel TRSM algorithms with 3D processor grids can reduce the communication cost in an analogous fashion to matrix multiplication. Irony and Toledo [20] presented the first parallelization of the recursive TRSM algorithm with a 3D processor grid. They demonstrated that the communication volume of their parallelization is $O(nkp^{1/3} + n^2 p^{1/3})$.…”
Section: Recursive Triangular Matrix Solve For Multiple Right (mentioning)
confidence: 99%
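
The communication-volume bound quoted in the statement above can be explored numerically. The sketch below evaluates the two terms $nkp^{1/3}$ and $n^2p^{1/3}$ with all hidden constants set to 1, which is an assumption: the big-O bound does not give them. It also assumes that $n$ is the dimension of the triangular matrix, $k$ the number of right-hand sides, and $p$ the number of processors; the quoted passage does not define these symbols. The function name is hypothetical.

```python
def trsm_3d_volume_terms(n: int, k: int, p: int) -> tuple[float, float]:
    """Return the two terms of O(n*k*p^(1/3) + n^2*p^(1/3)) with unit constants.

    Assumed meanings (not defined in the quoted text):
      n: dimension of the triangular matrix
      k: number of right-hand sides
      p: number of processors
    """
    if min(n, k, p) < 1:
        raise ValueError("n, k and p must be positive integers")
    cube_root_p = p ** (1.0 / 3.0)
    rhs_term = n * k * cube_root_p      # n * k * p^(1/3)
    matrix_term = n * n * cube_root_p   # n^2 * p^(1/3)
    return rhs_term, matrix_term


if __name__ == "__main__":
    rhs_term, matrix_term = trsm_3d_volume_terms(n=1000, k=100, p=8)
    print(f"right-hand-side term: {rhs_term:.3g}")
    print(f"matrix term:          {matrix_term:.3g}")
    print(f"sum:                  {rhs_term + matrix_term:.3g}")
```

Under these assumptions the $n^2p^{1/3}$ term dominates whenever $k < n$, and the $nkp^{1/3}$ term dominates when there are more right-hand sides than matrix rows.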
“…A first challenge to 2DBC was brought by the development of 3D and 2.5D algorithms [1], [2]. These algorithms are based on a 3D representation of dense linear algebra algorithms whose complexity is in $O(N^3)$ where $N$ is the size of the matrix (products, dense factorizations).…”
Section: Introduction (mentioning)
confidence: 99%