Covariance matrices are ubiquitous in computational science and engineering. In particular, large covariance matrices arise from multivariate spatial data sets, for instance in climate and weather modeling, where statistical methods and spatial data are used to improve prediction. One of the most time-consuming computational steps is the Cholesky factorization of the symmetric positive-definite covariance matrix. Such covariance matrices are also often data-sparse, in other words, effectively of low rank, though formally dense. While not typically globally of low rank, covariance matrices in which correlation decays with distance are nearly always hierarchically of low rank. While symmetry and positive definiteness should be, and nearly always are, exploited for performance, exploiting low-rank character in this context is very recent and will be key to solving these challenging problems at large scale. The authors design a new and flexible tile low-rank Cholesky factorization and propose a high-performance implementation using the OpenMP task-based programming model on various leading-edge manycore architectures. Performance and memory-footprint comparisons on covariance matrices of size up to 200K × 200K show a gain of more than an order of magnitude in both metrics over state-of-the-art open-source and vendor-optimized numerical libraries, while preserving the numerical accuracy of the original model. This research represents an important milestone in enabling large-scale simulations for covariance-based scientific applications.
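The data-sparsity claim can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 1D site layout, the exponential kernel, the tile size, the tolerance, and the helper names `exp_covariance` and `numerical_rank` are all assumptions chosen for illustration. The sketch compares the numerical rank of a diagonal tile with that of an off-diagonal tile coupling two well-separated groups of sites, then confirms that the full matrix still admits a dense Cholesky factorization.

```python
import numpy as np


def exp_covariance(x, ell):
    """Exponential covariance C[i, j] = exp(-|x_i - x_j| / ell) for 1D sites x."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / ell)


def numerical_rank(a, tol=1e-8):
    """Number of singular values larger than tol times the largest one."""
    s = np.linalg.svd(a, compute_uv=False)
    return int(np.sum(s > tol * s[0]))


# Illustrative sizes only: 256 sorted sites, tiles of 64 rows and columns.
n, nb, ell = 256, 64, 0.1
x = np.linspace(0.0, 1.0, n)
C = exp_covariance(x, ell)

diag_tile = C[:nb, :nb]            # interactions within one cluster of sites
off_tile = C[2 * nb:3 * nb, :nb]   # interactions between two separated clusters

print("rank of diagonal tile:    ", numerical_rank(diag_tile))
print("rank of off-diagonal tile:", numerical_rank(off_tile))

# The full matrix is symmetric positive definite, so a dense Cholesky
# factorization exists; a tile low-rank scheme would instead keep the
# off-diagonal tiles in compressed (low-rank) form.
L = np.linalg.cholesky(C)
print("Cholesky residual:", np.linalg.norm(L @ L.T - C) / np.linalg.norm(C))
```

This 1D exponential kernel is an especially favorable case: for sorted sites, a tile coupling two separated groups factors as an outer product of `exp(-x_i / ell)` and `exp(x_j / ell)`, so its rank is one. Other kernels and higher-dimensional site layouts would need their own rank study.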
We introduce a definition of the volume for a general rectangular matrix,
which for square matrices is equivalent to the absolute value of the
determinant. We generalize results for square maximum-volume submatrices to the
case of rectangular maximal-volume submatrices, show the connection of the
rectangular volume with optimal experimental design, and provide estimates for
the growth of the coefficients and approximation error in spectral and
Chebyshev norms. Three promising applications of such submatrices are
presented: recommender systems, finding maximal elements in low-rank matrices
and preconditioning of overdetermined linear systems. The code is available
online.
Comment: 29 pages, 1 figure, 3 tables, submitted to Linear Algebra and its
Applications
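As a small illustration of the rectangular volume, the sketch below assumes the commonly used definition: the product of the singular values, which for a tall matrix A with full column rank equals sqrt(det(AᵀA)). The abstract does not spell the definition out, so this formula and the helper name `rect_volume` are assumptions and may differ in detail from the authors' own. The sketch checks that the quantity reduces to |det A| for a square matrix.

```python
import numpy as np


def rect_volume(a):
    """Assumed definition: product of the singular values of a."""
    s = np.linalg.svd(a, compute_uv=False)
    return float(np.prod(s))


rng = np.random.default_rng(0)

# Square case: the volume should match the absolute value of the determinant.
A = rng.standard_normal((4, 4))
print(rect_volume(A), abs(np.linalg.det(A)))

# Tall case: the volume should match sqrt(det(B^T B)).
B = rng.standard_normal((7, 3))
print(rect_volume(B), np.sqrt(np.linalg.det(B.T @ B)))
```

Computing the volume from singular values avoids forming BᵀB explicitly, which squares the condition number. The determinant form is shown only as a cross-check.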