We consider high-dimensional generalized linear models with Lipschitz loss
functions, and prove a nonasymptotic oracle inequality for the empirical risk
minimizer with Lasso penalty. The penalty is based on the coefficients in the
linear predictor, after normalization with the empirical norm. The examples
include logistic regression, density estimation and classification with hinge
loss. Least squares regression is also discussed.
Comment: Published at http://dx.doi.org/10.1214/009053607000000929 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
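The estimator described above can be illustrated with a small sketch for the logistic-loss case. This is not the paper's analysis, only a minimal numerical illustration under assumed choices: an ISTA (proximal-gradient) solver, an illustrative penalty level lam, and the column scaling by the empirical norm mentioned in the abstract.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal map of the l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_logistic(X, y, lam, n_iter=500):
    """Sketch of l1-penalized logistic regression via proximal gradient.
    Columns of X are first scaled to unit empirical norm, so the penalty
    acts on the normalized coefficients, as in the abstract."""
    n, p = X.shape
    norms = np.linalg.norm(X, axis=0) / np.sqrt(n)  # empirical column norms
    Xs = X / norms
    beta = np.zeros(p)
    # Lipschitz constant of the logistic-loss gradient (Hessian <= X'X / (4n))
    L = np.linalg.norm(Xs, 2) ** 2 / (4 * n)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-Xs @ beta))       # fitted probabilities
        grad = Xs.T @ (mu - y) / n
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta / norms                             # back to the original scale
```

With lam large enough, the soft-thresholding step produces exact zeros, which is the variable-selection behaviour the oracle inequality quantifies.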
Oracle inequalities and variable selection properties for the Lasso in linear
models have been established under a variety of different assumptions on the
design matrix. We show in this paper how the different conditions and concepts
relate to each other. The restricted eigenvalue condition (Bickel et al., 2009)
or the slightly weaker compatibility condition (van de Geer, 2007) are
sufficient for oracle results. We argue that both these conditions allow for a
fairly general class of design matrices. Hence, optimality of the Lasso for
prediction and estimation holds in more general situations than it would
appear from coherence (Bunea et al., 2007b,c) or restricted isometry (Candes
and Tao, 2005) assumptions.
Comment: 33 pages, 1 figure.
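The compatibility constant discussed above is a minimum over a cone and has no closed form for general designs, but a crude Monte Carlo probe can show it is bounded away from zero for a given design. This sketch is an assumption-laden illustration: random search only upper-bounds the minimum, so it is a sanity check, not a certificate, and the cone constant L = 3 is the usual choice for the Lasso with a properly tuned penalty.

```python
import numpy as np

def compatibility_estimate(X, S, L=3.0, n_draws=5000, seed=0):
    """Monte Carlo probe of the compatibility constant
        phi^2 = min  s * beta' Sigma_hat beta / ||beta_S||_1^2
    over the cone  ||beta_{S^c}||_1 <= L * ||beta_S||_1,
    where Sigma_hat = X'X / n and s = |S|.  Random search gives an
    upper bound on the minimum, not a guarantee."""
    n, p = X.shape
    Sigma = X.T @ X / n
    S = np.asarray(S)
    Sc = np.setdiff1d(np.arange(p), S)
    s = len(S)
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(n_draws):
        beta = rng.normal(size=p)
        l1_S = np.abs(beta[S]).sum()
        l1_Sc = np.abs(beta[Sc]).sum()
        if l1_Sc > L * l1_S:                    # project into the cone by
            beta[Sc] *= L * l1_S / l1_Sc        # shrinking the off-support part
        val = s * (beta @ Sigma @ beta) / l1_S ** 2
        best = min(best, val)
    return best
```

For a well-conditioned random design the probe stays clearly positive, consistent with the claim that the compatibility condition holds for a fairly general class of design matrices.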
As the dimensionality of the alternative hypothesis increases, the power of classical tests tends to diminish quite rapidly. This is especially true for high-dimensional data in which there are more parameters than observations. We discuss a score test on a hyperparameter in an empirical Bayesian model as an alternative to classical tests. It gives a general test statistic which can be used to test a point null hypothesis against a high-dimensional alternative, even when the number of parameters exceeds the number of samples. This test is shown to have optimal power on average in a neighbourhood of the null hypothesis, which makes it a proper generalization of the locally most powerful test to multiple dimensions. To illustrate this new locally most powerful test we investigate in more detail the case of testing the global null hypothesis in a linear regression model. The score test is shown to have significantly more power than the F-test whenever, under the alternative, the large-variance principal components of the design matrix explain substantially more of the variance of the outcome than do the small-variance principal components. The score test is also useful for detecting sparse alternatives in truly high-dimensional data, where its power is comparable with that of the test based on the maximum absolute t-statistic. Copyright 2006 Royal Statistical Society.
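The behaviour described in the abstract can be sketched numerically. The statistic below, ||X'y||^2 calibrated by permutation, is a schematic stand-in for the score statistic (the paper's exact standardization and null distribution differ): it weights the principal components of X by their variance, so power concentrates where large-variance components explain the response, and it remains computable when the number of columns exceeds the number of rows.

```python
import numpy as np

def global_score_test(X, y, n_perm=999, seed=0):
    """Permutation version of a score-type statistic for the global null
    beta = 0 in a linear model y = X beta + noise.  Schematic only: the
    statistic ||X'y||^2 up-weights large-variance principal components
    of X, mimicking the power profile described in the abstract."""
    rng = np.random.default_rng(seed)
    yc = y - y.mean()                         # centre the response
    stat = np.sum((X.T @ yc) ** 2)
    perm_stats = np.array([
        np.sum((X.T @ rng.permutation(yc)) ** 2) for _ in range(n_perm)
    ])
    # add-one permutation p-value; valid even when p > n
    p_value = (1 + np.sum(perm_stats >= stat)) / (n_perm + 1)
    return stat, p_value
```

When the signal lies along a high-variance direction of the design, the statistic separates sharply from its permutation null even with more columns than rows, which is exactly the regime where the abstract claims the score test beats the F-test.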
This paper is a selective review of the regularization methods scattered across the statistics literature. We introduce a general conceptual approach to regularization and fit most existing methods into it. We have tried to focus on the importance of regularization when dealing with today's high-dimensional objects: data and models. A wide range of examples is discussed, including nonparametric regression, boosting, covariance matrix estimation, principal component estimation, and subsampling.