The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both data dimension $p$ and sample size $n$ are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. We also derive a fast iterative algorithm for computing the estimator, which relies on the popular Cholesky decomposition of the inverse but produces a permutation-invariant estimator. The method is compared to other estimators on simulated data and on a real data example of tumor tissue classification using gene expression data.Comment: Published in at http://dx.doi.org/10.1214/08-EJS176 the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org
We propose a new class of generalized thresholding operators that combine thresholding with shrinkage, and study generalized thresholding of the sample covariance matrix in high dimensions. Generalized thresholding of the covariance matrix has good theoretical properties and carries almost no computational burden. We obtain an explicit convergence rate in the operator norm that shows the tradeoff between the sparsity of the true model, dimension, and the sample size, and shows that generalized thresholding is consistent over a large class of models as long as the dimension p and the sample size n satisfy log p/n ! 0. In addition, we show that generalized thresholding has the ''sparsistency'' property, meaning it estimates true zeros as zeros with probability tending to 1, and, under an additional mild condition, is sign consistent for nonzero elements. We show that generalized thresholding covers, as special cases, hard and soft thresholding, smoothly clipped absolute deviation, and adaptive lasso, and compare different types of generalized thresholding in a simulation study and in an example of gene clustering from a microarray experiment with tumor tissues.
We propose a procedure for constructing a sparse estimator of a multivariate regression coefficient matrix that accounts for correlation of the response variables. This method, which we call multivariate regression with covariance estimation (MRCE), involves penalized likelihood with simultaneous estimation of the regression coefficients and the covariance structure. An efficient optimization algorithm and a fast approximation are developed for computing MRCE. Using simulation studies, we show that the proposed method outperforms relevant competitors when the responses are highly correlated. We also apply the new method to a finance example on predicting asset returns. An R-package containing this dataset and code for computing MRCE and its approximation are available online.
The paper proposes a new covariance estimator for large covariance matrices when the variables have a natural ordering. Using the Cholesky decomposition of the inverse, we impose a banded structure on the Cholesky factor, and select the bandwidth adaptively for each row of the Cholesky factor, using a novel penalty we call nested Lasso. This structure has more flexibility than regular banding, but, unlike regular Lasso applied to the entries of the Cholesky factor, results in a sparse estimator for the inverse of the covariance matrix. An iterative algorithm for solving the optimization problem is developed. The estimator is compared to a number of other covariance estimators and is shown to do best, both in simulations and on a real data example. Simulations show that the margin by which the estimator outperforms its competitors tends to increase with dimension.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with đź’™ for researchers
Part of the Research Solutions Family.