In this paper we tackle the inversion of large-scale dense matrices via conventional matrix factorizations (LU, Cholesky, LDL^T) and Gauss-Jordan elimination (GJE) on hybrid platforms consisting of a multi-core CPU and a many-core graphics processor (GPU). Specifically, we introduce the different matrix inversion algorithms using a unified framework based on the notation from the FLAME project; we develop hybrid implementations of the matrix operations underlying the algorithms, as alternatives to those in existing libraries for single-GPU systems; and we perform an extensive experimental study on a platform equipped with state-of-the-art general-purpose architectures from Intel and a "Fermi" GPU from NVIDIA, which exposes the efficiency of the different inversion approaches. Our study and experimental results show the simplicity and performance advantages of the GJE-based inversion methods, as well as the difficulties associated with the symmetric indefinite case.
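To make the GJE-based approach concrete, the following is a minimal, unblocked sketch of matrix inversion by Gauss-Jordan elimination on the augmented system [A | I], with partial pivoting. The function name and structure are ours for illustration only; the paper's blocked, hybrid CPU/GPU variants are considerably more elaborate.

```python
import numpy as np

def gje_invert(A):
    """Invert a square matrix via Gauss-Jordan elimination with
    partial pivoting, sweeping the augmented matrix [A | I].
    Unblocked illustrative sketch, not a tuned implementation."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A.copy(), np.eye(n)])   # augmented matrix [A | I]
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to row k.
        p = k + int(np.argmax(np.abs(M[k:, k])))
        if M[p, k] == 0.0:
            raise np.linalg.LinAlgError("matrix is singular")
        M[[k, p]] = M[[p, k]]
        M[k] /= M[k, k]                    # normalize the pivot row
        for i in range(n):                 # annihilate column k in all other rows
            if i != k:
                M[i] -= M[i, k] * M[k]
    return M[:, n:]                        # right half now holds A^{-1}
```

Unlike LU-based inversion, the GJE sweep applies the same update to the whole matrix at every step, which is what makes it attractive for accelerators with high, uniform throughput.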
The solution of linear systems is a recurrent operation in scientific and engineering applications, traditionally addressed via the LU factorization. The Gauss-Huard (GH) algorithm has been introduced as an efficient alternative on modern platforms equipped with accelerators, although this approach presented some functional constraints. In particular, it was not possible to reuse part of the computations in the solution of delayed linear systems or in the inversion of the matrix. Here, we adapt GH to overcome these two deficiencies, yielding new algorithms that exhibit the same computational cost as their counterparts based on the LU factorization of the matrix. We evaluate the novel GH extensions on the solution of Lyapunov matrix equations via the LRCF-ADI method, validating our approach via experiments with three benchmarks from model order reduction.

Figure 2. Blocked Gauss-Huard (GH) for the solution of Ax = b. On entry, Â = [A, b], and upon completion, the last column of Â is overwritten with the solution x.

Figure 3. Unblocked algorithm for the reutilization of Gauss-Huard (GH) in the solution of Ax = b. On entry, Ā is the matrix resulting from the application of the GH algorithm to A, and upon completion, b is overwritten with the solution x.

A related problem appears when A has been employed to solve a linear system via GH, so that its contents have been overwritten with the factorization, and the inverse of the matrix is required next. Under special circumstances this scenario can be of interest, for example, to avoid explicit multiplication with the matrix inverse in the solution of Lyapunov equations via the matrix sign function [10].
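As a rough illustration of the GH scheme, the following unblocked Gauss-Huard solver processes the augmented matrix [A, b] one row at a time and leaves the solution in the last column. The function name and this plain structure are ours; for simplicity the sketch omits the pivoting and blocking that a robust production code would require.

```python
import numpy as np

def gauss_huard_solve(A, b):
    """Unblocked Gauss-Huard sketch for solving Ax = b.

    Works on the augmented matrix M = [A, b]. At step k:
      1. update row k using the rows already processed above it
         (whose leading k-by-k block is, conceptually, the identity);
      2. normalize row k by its diagonal entry;
      3. annihilate column k above the diagonal.
    No pivoting is performed in this sketch."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    n = A.shape[0]
    M = np.hstack([A.copy(), b])                      # augmented matrix [A, b]
    for k in range(n):
        M[k, k:] -= M[k, :k] @ M[:k, k:]              # (1) row update
        M[k, k + 1:] /= M[k, k]                       # (2) normalize pivot row
        M[:k, k + 1:] -= np.outer(M[:k, k], M[k, k + 1:])  # (3) eliminate above
    return M[:, n].copy()                             # last column holds x
```

Like the LU factorization, this sweep costs 2n^3/3 + O(n^2) flops, but it only ever modifies the current row and the rows above it, which is the property the paper exploits on accelerators.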
We investigate the numerical computation of the matrix sign function of large-scale dense matrices, a common task in various application areas. The main computational work in Newton's iteration for the matrix sign function consists of matrix inversion. Therefore, we investigate the performance of two approaches for matrix inversion, based on Gaussian elimination (LU factorization) and Gauss-Jordan elimination. The target architecture is a current general-purpose multi-core processor connected to a graphics processor. Parallelism is extracted on both processors by linking sequential versions of the codes with multi-threaded implementations of BLAS. Our results on a system with two Intel Quad-Core processors and an NVIDIA Tesla C1060 illustrate the performance and scalability attained by the codes on this system.
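Newton's iteration for the matrix sign function, which drives the inversion workload described above, can be sketched in a few lines. This is the plain iteration X_{k+1} = (X_k + X_k^{-1})/2 with X_0 = A, without the scaling strategies a practical code would add; the function name and stopping criterion are our own choices.

```python
import numpy as np

def sign_newton(A, tol=1e-12, maxiter=100):
    """Plain Newton iteration for the matrix sign function:
        X_0 = A,  X_{k+1} = (X_k + X_k^{-1}) / 2.
    Each step is dominated by one dense matrix inversion, which is
    why the choice of inversion kernel (LU- vs. GJE-based) governs
    overall performance. Requires A to have no eigenvalues on the
    imaginary axis; unscaled sketch, no acceleration."""
    X = np.asarray(A, dtype=float).copy()
    for _ in range(maxiter):
        Xnew = 0.5 * (X + np.linalg.inv(X))
        # Stop when the relative change stagnates.
        if np.linalg.norm(Xnew - X, 1) <= tol * np.linalg.norm(Xnew, 1):
            return Xnew
        X = Xnew
    return X
```

For a matrix whose eigenvalues all lie in the open right half-plane (e.g., a symmetric positive definite matrix), the iteration converges to the identity; negating the matrix yields minus the identity.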