2006
DOI: 10.1016/j.jpdc.2006.07.001

Parallel sparse LU factorization on different message passing platforms

Abstract: Several message passing-based parallel solvers have been developed for general (nonsymmetric) sparse LU factorization with partial pivoting. Existing solvers were mostly deployed and evaluated on parallel computing platforms with high message passing performance (e.g., 1-10 µs in message latency and 100-1000 Mbytes/sec in message throughput) while little attention has been paid to slower platforms. This paper investigates techniques that are specifically beneficial for LU factorization on platforms with slow m…
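For context, the factorization the abstract refers to computes PA = LU for a general sparse matrix, with row pivoting for numerical stability. Below is a minimal, purely illustrative sketch of that computation using SciPy's sequential SuperLU wrapper on a small made-up system; it is not the paper's parallel, message-passing implementation, and the matrix values are placeholders.

```python
# Minimal sketch: sparse LU factorization with partial pivoting on one node.
# Uses SciPy's SuperLU wrapper purely to illustrate the kind of computation
# the parallel solvers discussed in the paper distribute across processes.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Small hypothetical nonsymmetric sparse system A x = b (placeholder values).
A = sp.csc_matrix(np.array([[4.0, 1.0, 0.0],
                            [2.0, 5.0, 1.0],
                            [0.0, 3.0, 6.0]]))
b = np.array([1.0, 2.0, 3.0])

lu = spla.splu(A)   # LU factorization with (threshold) partial pivoting
x = lu.solve(b)     # triangular solves using the computed L and U factors

print("x =", x)
print("residual norm =", np.linalg.norm(A @ x - b))
```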



Cited by 5 publications (3 citation statements)
References 31 publications
“…This verifies the accuracy of the BBCR solver. Secondly, the speed-up ratios of the BBCR and MUMPS solvers are compared in Fig. 3. On the one hand, we can see that MUMPS shows poor speed-up performance, which is consistent with related research (Shen K, 2006), and it seems that the limiting speed-up ratio of MUMPS for this problem is less than 2.…”
Section: Numerical Experiments (supporting)
confidence: 87%
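The comparison quoted above is stated in terms of the speed-up ratio, i.e. the serial factorization time divided by the time on p processes. The following is a tiny hypothetical sketch of how such a curve is tabulated from wall-clock timings; the numbers are placeholders, not measurements from the cited solvers.

```python
# Hypothetical speed-up tabulation: speed-up(p) = T(1) / T(p),
# where T(p) is the wall-clock factorization time on p processes.
# The timings below are placeholders, not data from BBCR, MUMPS, or SuperLU.
serial_time = 120.0                            # T(1), seconds
parallel_times = {2: 75.0, 4: 64.0, 8: 61.0}   # T(p), seconds

for p, t in sorted(parallel_times.items()):
    speedup = serial_time / t
    efficiency = speedup / p
    print(f"p={p}: speed-up={speedup:.2f}, parallel efficiency={efficiency:.2f}")
```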
“…It can not only simulate multi-shot seismic data efficiently, but also reduce the memory requirement and computing time. However, the speed-up performance of traditional parallel sparse LU solvers, such as MUMPS (Amestoy et al., 2006; MUMPS Team, 2015) and SuperLU, is quite poor (Shen K, 2006). Therefore, a new parallel direct solver is required to get better speed-up performance.…”
Section: Introduction (mentioning)
“…For example, Van der Stappen et al. [32] present an algorithm for parallel calculation of the LU decomposition on a mesh network of transputers where each processor holds a part of the matrix. Shen [31] evaluates techniques for LU decomposition distributed over nodes that are connected via slow message passing. Dongarra et al. [9] demonstrate an optimized implementation of matrix inversion on a single multicore node, focusing on the minimization of synchronization between the different processing cores.…”
Section: Foundations and Related Work (mentioning)
confidence: 99%
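The related work quoted above rests on distributing the matrix so that each processor holds a part of it. A hypothetical mpi4py sketch of the simplest such layout, a 1-D block-row distribution, follows; it only illustrates the data placement, not a full distributed LU factorization with pivoting.

```python
# Hypothetical sketch of a 1-D block-row distribution over MPI processes.
# Run with, e.g., `mpiexec -n 4 python block_rows.py` (requires mpi4py).
# Illustrates only the data layout, not distributed LU with pivoting.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n = 8  # small placeholder matrix dimension
if rank == 0:
    A = np.random.rand(n, n)
    blocks = np.array_split(A, size, axis=0)   # one row block per process
else:
    blocks = None

local_rows = comm.scatter(blocks, root=0)      # each rank receives its block
print(f"rank {rank} holds a {local_rows.shape} block of rows")
```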