High-dimensional generalized linear models and the lasso

Geer, Sara A. van de

doi:10.1214/009053607000000929

Cited by 521 publications

(476 citation statements)

References 23 publications

Supporting

Mentioning

459

Contrasting

“…Motivated by many practical prediction problems, including those that arise in microarray data analysis and natural language processing, this problem has been extensively studied in recent years. The results can be divided into two categories: those that study the predictive power of β [9,30,12] and those that study its sparsity pattern and reconstruction properties [4,32,18,19,17,8]; this article falls into the first of these categories.…”

Section: Introduction (mentioning)

confidence: 99%

ℓ1-regularized linear regression: persistence and oracle inequalities

Bartlett

Mendelson

Neeman

2011

Probab. Theory Relat. Fields

We study the predictive performance of ℓ1-regularized linear regression, including the case where the number of covariates is substantially larger than the sample size. We introduce a new analysis method that does not require uniformly bounded covariates, an assumption that was often necessary with previous techniques. This technique provides an answer to a conjecture of Greenshtein and Ritov [12] regarding the "persistence" rate for linear regression and allows us to prove an oracle inequality for the error of the regularized minimizer.

“…[2,6,10,13,20,19] demonstrated the fundamental result that ℓ1-penalized least squares estimators achieve the rate √(s/n) · √(log p), which is very close to the oracle rate √(s/n) achievable when the true model is known. [17] demonstrated a similar fundamental result on the excess forecasting error loss under both quadratic and non-quadratic loss functions. Thus the estimator can be consistent and can have excellent forecasting performance even under very rapid, nearly exponential growth of the total number of regressors p. [1] investigated the ℓ1-penalized quantile regression process, obtaining similar results.…”

Section: Introduction (mentioning)

confidence: 57%

“…Several papers have begun to investigate estimation of HDSMs, primarily focusing on penalized mean regression, with the ℓ1-norm acting as a penalty function [2,6,10,13,17,20,19]. [2,6,10,13,20,19] demonstrated the fundamental result that ℓ1-penalized least squares estimators achieve the rate √(s/n) · √(log p), which is very close to the oracle rate √(s/n) achievable when the true model is known.…”

Section: Introduction (mentioning)

confidence: 99%

Post-l1-penalized estimators in high-dimensional linear regression models

Belloni

Chernozhukov

2010

Abstract. In this paper we study post-penalized estimators which apply ordinary, unpenalized linear regression to the model selected by first-step penalized estimators, typically LASSO. It is well known that LASSO can estimate the regression function at nearly the oracle rate, and is thus hard to improve upon. We show that post-LASSO performs at least as well as LASSO in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the LASSO-based model selection "fails" in the sense of missing some components of the "true" regression model. By the "true" model we mean here the best s-dimensional approximation to the regression function chosen by the oracle. Furthermore, post-LASSO can perform strictly better than LASSO, in the sense of a strictly faster rate of convergence, if the LASSO-based model selection correctly includes all components of the "true" model as a subset and also achieves a sufficient sparsity. In the extreme case, when LASSO perfectly selects the "true" model, the post-LASSO estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by LASSO which guarantees that this dimension is at most of the same order as the dimension of the "true" model. Our rate results are non-asymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the LASSO estimator in the first step, but also applies to other estimators, for example, the trimmed LASSO, Dantzig selector, or any other estimator with good rates and good sparsity. Our analysis covers both traditional trimming and a new practical, completely data-driven trimming scheme that induces maximal sparsity subject to maintaining a certain goodness-of-fit. The latter scheme has theoretical guarantees similar to those of LASSO or post-LASSO, but it dominates these procedures as well as traditional trimming in a wide variety of experiments.
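
The two-step idea in this abstract (a LASSO fit for model selection, then unpenalized least squares on the selected columns) can be illustrated with a minimal sketch. This is not the authors' implementation: the coordinate-descent solver, the names `lasso_cd` and `post_lasso`, the penalty level `lam`, and the simulated data are all assumptions chosen for illustration, and the penalty is not tuned by any rule from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    # Scalar soft-thresholding operator used by the LASSO coordinate update.
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Plain cyclic coordinate descent for
    #   (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1   (no intercept).
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()          # running residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]         # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]         # add the updated coordinate back
    return b

def post_lasso(X, y, lam):
    # Step 1: LASSO selects a support. Step 2: OLS refit on that support.
    b_lasso = lasso_cd(X, y, lam)
    support = np.flatnonzero(b_lasso)
    b_post = np.zeros(X.shape[1])
    if support.size:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        b_post[support] = coef
    return b_lasso, b_post, support

# Simulated sparse design with more covariates than observations (assumed setup).
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.5 * rng.normal(size=n)

b_lasso, b_post, support = post_lasso(X, y, lam=0.3)
rss_lasso = np.sum((y - X @ b_lasso) ** 2)
rss_post = np.sum((y - X @ b_post) ** 2)
print("selected columns:", support)
print("in-sample RSS, LASSO vs post-LASSO:", rss_lasso, rss_post)
```

By construction the refit cannot have a larger in-sample residual sum of squares than the LASSO fit on the same support, which reflects the reduced shrinkage bias the abstract mentions; it says nothing by itself about out-of-sample rates.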

“…In addition to linear regression models, the idea of the penalized regressions has been broadly applied to various statistical models and problems; generalized linear models (Van de Geer, 2008), Cox proportional hazard models (Fan and Li, 2002), Gaussian graphical models (Friedman et al, 2013), principal component analysis (Park, 2013) and high-dimensional clustering problems (Kwon et al, 2013).…”

Section: Introduction (mentioning)

confidence: 99%

A note on standardization in penalized regressions

Lee

2015

Journal of the Korean Data and Information Science Society

We consider sparse high-dimensional linear regression models. Penalized regressions have been used as effective methods for variable selection and estimation in high-dimensional models. In penalized regressions, it is common practice to standardize variables before fitting a penalized model and then fit a penalized model with standardized variables. Finally, the estimated coefficients from a penalized model are recovered to the scale on original variables. However, these procedures produce a slightly different solution compared to the corresponding original penalized problem. In this paper, we investigate issues on the standardization of variables in penalized regressions and formulate the definition of the standardized penalized estimator. In addition, we compare the original penalized estimator with the standardized penalized estimator through simulation studies and real data analysis.
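
The "recover the coefficients on the original scale" step described in this abstract can be sketched as follows. The helper names `standardize` and `to_original_scale`, the simulated covariates, and the coefficient values (a stand-in for whatever a penalized fit on standardized variables would return) are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def standardize(X):
    # Center each column and scale it to unit standard deviation.
    mean = X.mean(axis=0)
    scale = X.std(axis=0)
    return (X - mean) / scale, mean, scale

def to_original_scale(beta_std, intercept_std, mean, scale):
    # Map coefficients fitted on standardized columns back to the raw columns:
    #   a + ((x - mean) / scale) @ b  ==  (a - mean @ (b / scale)) + x @ (b / scale)
    beta = beta_std / scale
    intercept = intercept_std - mean @ beta
    return beta, intercept

rng = np.random.default_rng(1)
# Three covariates on very different scales (assumed example data).
X = rng.normal(loc=5.0, scale=[1.0, 10.0, 0.1], size=(50, 3))
Z, mean, scale = standardize(X)

# Stand-in for coefficients returned by a penalized fit on Z (hypothetical values).
beta_std = np.array([0.8, 0.0, -1.2])
intercept_std = 2.0

beta, intercept = to_original_scale(beta_std, intercept_std, mean, scale)
pred_std = intercept_std + Z @ beta_std
pred_orig = intercept + X @ beta
print("max prediction difference:", np.max(np.abs(pred_std - pred_orig)))
```

The back-transform preserves predictions and zero coefficients, but the penalty applied on the standardized scale is not the same penalty on the original scale, which is the discrepancy the abstract investigates.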
