This paper investigates the efficient use of half-precision floating-point (FP16) arithmetic on GPUs to accelerate LU decomposition in double (FP64) precision. Motivated by the goal of improving computational efficiency, we introduce two novel algorithms: Pre-Pivoted LU (PRP) and Mixed-precision Panel Factorization (MPF). Deployed in both hybrid CPU-GPU setups and native GPU-only configurations, PRP identifies pivot lists through an LU decomposition computed in reduced precision, then reorders the matrix rows in FP64 before executing LU decomposition without pivoting. Two variants of PRP, hPRP and xPRP, are introduced, differing in whether the pivot lists are computed in full half precision or in mixed half-single precision. The MPF algorithm produces an FP64 LU factorization while internally using hPRP for panel factorization, achieving accuracy on par with standard DGETRF at superior speed. The study further explores the auxiliary functions required for the native-mode implementation of the PRP variants and MPF.
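The core PRP idea described above (discover the pivot order cheaply in reduced precision, then factor the FP64 matrix without pivoting after reordering) can be illustrated with a minimal NumPy sketch. This is not the paper's GPU implementation: FP32 stands in for FP16 (NumPy's LAPACK path has no half-precision LU), and the no-pivot elimination is a plain textbook loop.

```python
import numpy as np
from scipy.linalg import lu_factor

def prp_lu(A):
    """Pre-Pivoted LU sketch: pivots found in reduced precision,
    then no-pivot Gaussian elimination on the reordered FP64 matrix."""
    # Step 1: pivot discovery in reduced precision (FP32 stands in for FP16).
    _, ipiv = lu_factor(A.astype(np.float32))  # partial-pivoting LU, low precision
    # Convert LAPACK's sequential-swap ipiv into a row permutation.
    perm = np.arange(A.shape[0])
    for i, p in enumerate(ipiv):
        perm[[i, p]] = perm[[p, i]]
    # Step 2: reorder rows in FP64 and factor WITHOUT pivoting.
    U = A[perm].astype(np.float64)
    n = U.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        L[k + 1:, k] = U[k + 1:, k] / U[k, k]       # multipliers
        U[k + 1:, k:] -= np.outer(L[k + 1:, k], U[k, k:])
        U[k + 1:, k] = 0.0                           # exact zeros below diagonal
    return perm, L, U
```

For a well-conditioned matrix, the low-precision pivot list matches the FP64 one almost everywhere, so the no-pivot factorization of the pre-permuted matrix satisfies `A[perm] ≈ L @ U`.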
Parker and Lê introduced random butterfly transforms (RBTs) as a preprocessing technique to replace pivoting in dense LU factorization. Unfortunately, their FFT-like recursive structure restricts the dimensions of the matrix, and on multi-node systems, efficient management of the communication overhead restricts the matrix's distribution even further. To remove these limitations, we generalized the RBT to arbitrary matrix sizes by truncating the dimensions of each layer in the transform. We extended Parker's theoretical analysis to the generalized RBT, showing in particular that, in exact arithmetic, Gaussian elimination with no pivoting succeeds with probability 1 after transforming a matrix with full-depth RBTs. Furthermore, we experimentally show that these generalized transforms improve performance over Parker's formulation by up to 62% while retaining the ability to replace pivoting. The generalized RBT is available in the SLATE numerical software library.
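The mechanism behind RBT preprocessing can be sketched with a minimal depth-1, two-sided transform in NumPy. This is an illustrative toy, not SLATE's implementation or the paper's generalized (truncated) variant: a single butterfly layer B = (1/√2)[[R0, R1], [R0, −R1]] with random diagonal blocks R0, R1 is applied on both sides, which randomizes the matrix enough that no-pivot elimination avoids the zero leading pivot in the example below.

```python
import numpy as np

def butterfly(n, rng):
    """One random butterfly layer (n even): B = 1/sqrt(2) [[R0, R1], [R0, -R1]]
    with random diagonal R0, R1 whose entries are scalings near 1."""
    h = n // 2
    r0 = np.exp(rng.uniform(-0.05, 0.05, h))
    r1 = np.exp(rng.uniform(-0.05, 0.05, h))
    B = np.zeros((n, n))
    B[:h, :h] = np.diag(r0)
    B[:h, h:] = np.diag(r1)
    B[h:, :h] = np.diag(r0)
    B[h:, h:] = -np.diag(r1)
    return B / np.sqrt(2)

def rbt_transform(A, rng):
    """Depth-1 two-sided RBT: returns U^T A V together with (U, V)."""
    n = A.shape[0]
    U = butterfly(n, rng)
    V = butterfly(n, rng)
    return U.T @ A @ V, (U, V)
```

To solve A x = b, one factors the transformed matrix U^T A V (now amenable to no-pivot LU), solves U^T A V y = U^T b, and recovers x = V y; the butterflies are cheap to apply and invert because each layer is built from diagonal blocks.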
SLATE (Software for Linear Algebra Targeting Exascale) is a distributed, dense linear algebra library targeting both CPU-only and GPU-accelerated systems, developed over the course of the Exascale Computing Project (ECP). While it began with several documents setting out its initial design, significant design changes occurred throughout its development. In some cases, these were anticipated: an early version used a simple consistency flag that was later replaced with a full-featured consistency protocol. In other cases, performance limitations and software and hardware changes prompted a redesign. Sequential communication tasks were parallelized; host-to-host MPI calls were replaced with GPU device-to-device MPI calls; more advanced algorithms such as Communication Avoiding LU and the Random Butterfly Transform (RBT) were introduced. Early choices that turned out to be cumbersome, error prone, or inflexible have been replaced with simpler, more intuitive, or more flexible designs. Applications have been a driving force, prompting a lighter-weight queue class, nonuniform tile sizes, and more flexible MPI process grids. Of paramount importance has been building a portable library that works across several different GPU architectures (AMD, Intel, and NVIDIA) while keeping a clean and maintainable codebase. Here we explore the evolving design choices and their effects, both in terms of performance and software sustainability.