We derive a CUR approximate matrix factorization based on the Discrete Empirical Interpolation Method (DEIM). For a given matrix A, such a factorization provides a low-rank approximate decomposition of the form A ≈ CUR, where C and R are subsets of the columns and rows of A, and U is constructed to make CUR a good approximation. Given a low-rank singular value decomposition $A \approx VSW^T$, the DEIM procedure uses V and W to select the columns and rows of A that form C and R. Through an error analysis applicable to a general class of CUR factorizations, we show that the accuracy tracks the optimal approximation error within a factor that depends on the conditioning of submatrices of V and W. For very large problems, V and W can be approximated well using an incremental QR algorithm that makes only one pass through A. Numerical examples illustrate the favorable performance of the DEIM-CUR method compared to CUR approximations based on leverage scores.
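As a rough illustration of the selection step summarized above, the following NumPy sketch applies a greedy DEIM index selection to the leading left and right singular vectors and assembles C, U, and R. It rests on several assumptions that the text above does not state: the function names `deim` and `deim_cur` are hypothetical, the singular vectors come from a full dense SVD rather than the one-pass incremental QR approach, and U is formed as $C^{+} A R^{+}$ (with $^{+}$ the pseudoinverse), which is one common choice for making CUR a good approximation.

```python
import numpy as np


def deim(V):
    """Greedy DEIM index selection on the columns of V (shape n_rows x k).

    Hypothetical helper: returns k row indices, one per column of V.
    """
    n_rows, k = V.shape
    idx = np.empty(k, dtype=int)
    # First index: largest-magnitude entry of the first basis vector.
    idx[0] = np.argmax(np.abs(V[:, 0]))
    for j in range(1, k):
        # Interpolate column j at the indices chosen so far ...
        c = np.linalg.solve(V[idx[:j], :j], V[idx[:j], j])
        # ... and pick the row where the interpolation residual is largest.
        r = V[:, j] - V[:, :j] @ c
        idx[j] = np.argmax(np.abs(r))
    return idx


def deim_cur(A, k):
    """Rank-k CUR sketch: DEIM-selected rows/columns, U = pinv(C) A pinv(R)."""
    # Assumption: a dense SVD is affordable; large problems would need an
    # approximate V and W instead.
    V_full, s, Wt = np.linalg.svd(A, full_matrices=False)
    V = V_full[:, :k]      # leading left singular vectors, m x k
    W = Wt[:k, :].T        # leading right singular vectors, n x k
    p = deim(V)            # row indices of A
    q = deim(W)            # column indices of A
    C = A[:, q]
    R = A[p, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R, p, q
```

For a matrix of exact rank k, this construction should reproduce A up to rounding error; for general matrices the quality of the approximation depends on how well conditioned the selected k × k submatrices of V and W are.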
Introduction

This work presents a new CUR matrix factorization based upon the Discrete Empirical Interpolation Method (DEIM). A CUR factorization is a low-rank approximation of a matrix $A \in \mathbb{R}^{m \times n}$ of the form A ≈ CUR, where $C = A(:, q) \in \mathbb{R}^{m \times k}$ is a subset of the columns of A and $R = A(p, :) \in \mathbb{R}^{k \times n}$ is a subset of the rows of A. (We generally assume m ≥ n throughout.) The k × k matrix U is constructed to ensure that CUR is a good approximation to A. Assuming the best rank-k singular value decomposition (SVD) $A \approx VSW^T$ is available, the algorithm uses the DEIM index selection procedure, q = DEIM(W) and p = DEIM(V), to determine C and R. The resulting approximate factorization is nearly as accurate as the best rank-k SVD, with

$$\|A - CUR\| \le (\eta_p + \eta_q)\, \sigma_{k+1},$$

where $\sigma_{k+1}$ is the first neglected singular value of A, $\eta_p \equiv \|V(p, :)^{-1}\|$, and $\eta_q \equiv \|W(q, :)^{-1}\|$. Here and throughout, $\|\cdot\|$ denotes the vector 2-norm and the matrix norm it induces, and $\|\cdot\|_F$ is the Frobenius norm. We use MATLAB notation to index vectors and matrices, so that, e.g., A(p, :) denotes the k rows of A *