We investigate several ways to improve the performance of sparse LU factorization with partial pivoting, as used to solve unsymmetric linear systems. To perform most of the numerical computation in dense matrix kernels, we introduce the notion of unsymmetric supernodes. To better exploit the memory hierarchy, we introduce unsymmetric supernode-panel updates and two-dimensional data partitioning. To speed up symbolic factorization, we use Gilbert and Peierls's depth-first search with Eisenstat and Liu's symmetric structural reductions. We have implemented a sparse LU code using all these ideas. We present experiments demonstrating that it is significantly faster than earlier partial pivoting codes. We also compare performance with Umfpack, which uses a multifrontal approach; our code is usually faster.
Keywords: sparse matrix algorithms; unsymmetric linear systems; supernodes; column elimination tree; partial pivoting.
AMS(MOS) subject classifications: 65F05, 65F50.
Computing Reviews descriptors: G.1.3 [Numerical Analysis]: Numerical Linear Algebra - Linear systems (direct and iterative methods), Sparse and very large systems.
The most widely used ordering scheme to reduce fills and operations in sparse matrix computation is the minimum-degree algorithm. The notion of multiple elimination is introduced here as a modification to the conventional scheme. The motivation is discussed using the k-by-k grid model problem. Experimental results indicate that the modified version retains the fill-reducing property of (and is often better than) the original ordering algorithm and yet requires less computer time. The reduction in ordering time is problem dependent, and for some problems the modified algorithm can run a few times faster than existing implementations of the minimum-degree algorithm. The use of external degree in the algorithm is also introduced.
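As an illustration only, the following Python sketch contrasts a textbook minimum-degree ordering with a simplified reading of multiple elimination on the k-by-k grid model problem. It is not the implementation from the paper: the function names (`grid_graph`, `eliminate`, `minimum_degree`), the explicit elimination-graph representation, the 5-point stencil, and the lexicographic tie-breaking are all assumptions made for the example, and external degree is not modeled. "Multiple elimination" is taken here to mean eliminating an independent set of current minimum-degree nodes before degrees are recomputed.

```python
from itertools import combinations


def grid_graph(k):
    """Adjacency sets for the 5-point stencil on a k-by-k grid (assumed model)."""
    adj = {(i, j): set() for i in range(k) for j in range(k)}
    for i in range(k):
        for j in range(k):
            if i + 1 < k:
                adj[(i, j)].add((i + 1, j))
                adj[(i + 1, j)].add((i, j))
            if j + 1 < k:
                adj[(i, j)].add((i, j + 1))
                adj[(i, j + 1)].add((i, j))
    return adj


def eliminate(adj, v):
    """Remove v, make its neighbors a clique, and return the fill edges added."""
    nbrs = adj.pop(v)
    fill = 0
    for a, b in combinations(nbrs, 2):
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            fill += 1
    for u in nbrs:
        adj[u].discard(v)
    return fill


def minimum_degree(graph, multiple=False):
    """Return (ordering, fill count) for a simple minimum-degree heuristic.

    With multiple=True, each pass eliminates an independent set of nodes
    that all had minimum degree when the pass started.
    """
    adj = {v: set(n) for v, n in graph.items()}  # work on a copy
    order, fill = [], 0
    while adj:
        dmin = min(len(n) for n in adj.values())
        candidates = sorted(v for v, n in adj.items() if len(n) == dmin)
        if not multiple:
            candidates = candidates[:1]
        blocked = set()  # neighbors of nodes eliminated in this pass
        for v in candidates:
            if v in blocked:
                continue  # its degree may have changed; defer to a later pass
            blocked |= adj[v]
            fill += eliminate(adj, v)
            order.append(v)
    return order, fill


if __name__ == "__main__":
    g = grid_graph(6)
    for flag in (False, True):
        order, fill = minimum_degree(g, multiple=flag)
        print("multiple elimination" if flag else "single elimination", fill)
```

In this sketch the saving from multiple elimination is that the minimum-degree search runs once per pass rather than once per eliminated node; the fill counts of the two variants can differ because the elimination order differs.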
Two techniques are introduced to reduce the working storage requirement for the recent multifrontal method of Duff and Reid used in the sparse out-of-core factorization of symmetric matrices. For a given core size, the reduction in working storage allows some large problems to be solved without having to use auxiliary storage for the working arrays. Even if the working arrays exceed the core size, the reduction decreases the amount of input-output traffic necessary to manipulate the working vectors. Experimental results are provided to demonstrate significant storage reduction on practical problems using the proposed techniques.