2017
DOI: 10.1214/16-sts586
On the Sensitivity of the Lasso to the Number of Predictor Variables

Abstract. The Lasso is a computationally efficient regression regularization procedure that can produce sparse estimators when the number of predictors (p) is large. Oracle inequalities provide probability loss bounds for the Lasso estimator at a deterministic choice of the regularization parameter. These bounds tend to zero if p is appropriately controlled, and are thus commonly cited as theoretical justification for the Lasso and its ability to handle high-dimensional settings. Unfortunately, in practice the…
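The abstract's setting can be made concrete with a small simulation. The sketch below is only illustrative (the sample sizes, coefficient values and grid of p are assumptions, not the paper's design): it keeps a fixed sparse signal, pads the design with increasing numbers of irrelevant predictors, chooses the Lasso penalty data-dependently by cross-validation (scikit-learn's LassoCV), and reports the resulting test loss.

```python
# Illustrative sketch (assumed settings, not the paper's simulation design):
# fix a sparse signal, grow the number of predictors p, and track the test
# loss of the Lasso when the penalty is chosen by cross-validation.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, n_test, k = 100, 1000, 5              # sample sizes and number of true nonzero coefficients
beta_true = np.array([3.0, -2.0, 1.5, -1.0, 0.5])

for p in [10, 50, 200, 1000]:            # total number of predictors (true + irrelevant)
    X = rng.standard_normal((n, p))
    X_test = rng.standard_normal((n_test, p))
    y = X[:, :k] @ beta_true + rng.standard_normal(n)
    y_test = X_test[:, :k] @ beta_true + rng.standard_normal(n_test)

    fit = LassoCV(cv=10).fit(X, y)       # data-dependent choice of the penalty
    loss = np.mean((y_test - fit.predict(X_test)) ** 2)
    print(f"p={p:5d}  selected penalty={fit.alpha_:.3f}  test MSE={loss:.3f}")
```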

Cited by 13 publications (10 citation statements, all classified as mentioning; citing publications span 2018–2023). References 34 publications.
“…Although various loss bounds have been derived that support the use of the lasso for a deterministic choice of the regularization parameter, in practice, the tuning parameter is chosen data dependently with good reasons. As pointed out in the work of Flynn et al (2014), the loss of the lasso when using data-dependent tuning parameters and without knowing which variables have non-zero coefficients as compared to the loss obtained for the true sparse model is much larger than suggested by oracle inequalities. They demonstrate the effect for categorical predictors in a small simulation study.…”
Section: Discussion by J. Chiquet, Y. Grandvalet and G. Rigaill (mentioning)
confidence: 95%
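The comparison Chiquet, Grandvalet and Rigaill describe can be sketched in a few lines. The toy setup below is an assumption for illustration (it is not the simulation design of Flynn et al.): it contrasts the test loss of the cross-validated Lasso fitted on all p predictors with that of least squares fitted on the true support, i.e., the "true sparse model" one could use if the nonzero coefficients were known in advance.

```python
# Sketch of the comparison described above (assumed data-generating process):
# cross-validated Lasso over all p predictors vs. least squares on the true
# support, which plays the role of the "true sparse model".
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(1)
n, n_test, p, k = 100, 5000, 500, 5
beta = np.zeros(p)
beta[:k] = [3.0, -2.0, 1.5, -1.0, 0.5]

X, X_test = rng.standard_normal((n, p)), rng.standard_normal((n_test, p))
y = X @ beta + rng.standard_normal(n)
y_test = X_test @ beta + rng.standard_normal(n_test)

lasso = LassoCV(cv=10).fit(X, y)              # tuning parameter chosen from the data
oracle = LinearRegression().fit(X[:, :k], y)  # least squares on the known true support

mse_lasso = np.mean((y_test - lasso.predict(X_test)) ** 2)
mse_oracle = np.mean((y_test - oracle.predict(X_test[:, :k])) ** 2)
print(f"Lasso / true-sparse-model test-MSE ratio: {mse_lasso / mse_oracle:.2f}")
```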
“…Even though one of the analyzed data sets is identical in both studies, there are (minor) discrepancies between the results of the two studies due to differences in study design and implementation details. For example, it is well-known that the results of LASSO depend on the number of variables [39], so it is most likely that the pre-selection step (not performed in De Bin et al [6]) changes the results noticeably. Other discrepancies may be related to the use of a FP function to model age and to differences in the standardization step when performing boosting/lasso (in De Bin et al [6] the standardization was performed on the whole training set, while we standardized the variables on the single subsamples).…”
Section: Discussion (mentioning)
confidence: 99%
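A minimal sketch of why a pre-selection step can change the Lasso's results is given below; the univariate screening rule and all settings are hypothetical stand-ins chosen for illustration, not the procedures used in De Bin et al. or the citing study. Because the cross-validated penalty and the retained support both depend on how many variables enter the fit, the two pipelines can disagree.

```python
# Sketch (illustrative pre-selection rule, not the one used in the cited studies):
# Lasso on all p variables vs. Lasso after univariate screening to the top m
# predictors, showing that the retained support can differ between pipelines.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
n, p, k, m = 120, 300, 5, 30
beta = np.zeros(p)
beta[:k] = [2.0, -2.0, 1.5, -1.0, 1.0]
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)

# Pipeline 1: Lasso on all variables.
full = LassoCV(cv=5).fit(X, y)
support_full = set(np.flatnonzero(full.coef_))

# Pipeline 2: keep the m variables most correlated with y, then fit the Lasso.
corr = np.abs(X.T @ (y - y.mean())) / n
keep = np.argsort(corr)[-m:]
screened = LassoCV(cv=5).fit(X[:, keep], y)
support_screened = set(keep[np.flatnonzero(screened.coef_)])

print("selected only without screening:", sorted(support_full - support_screened))
print("selected only with screening:   ", sorted(support_screened - support_full))
```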
“…Formal proofs for the case of diagonal loading were provided by Pajovic [129, Chapter 3], wherein the optimum diagonal loading coefficient δ_2 is predicted using Random Matrix Theory and the sensitivity of the optimum diagonal loading to the physics of the problem is analyzed for array processing. Similarly, it has been shown [62] that when δ_1 of (2.16) is chosen in a data-dependent manner, the performance of the LASSO estimator can rapidly deteriorate in practice. Indeed, experience also bears this out, as a poor choice of parameter can cause the regularized solution to perform worse than expected.…”
Section: The Need For Alternate Approaches (mentioning)
confidence: 99%
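To illustrate the last point, the sketch below sweeps a small grid of penalty values for an ordinary Lasso fit (all settings are assumed, and the regularization parameter δ_1 is simply played here by scikit-learn's alpha) and shows how strongly the test error and the sparsity of the solution depend on that single choice.

```python
# Sketch (assumed simulation settings) of how the quality of the Lasso fit
# depends on the regularization parameter: sweep a grid of penalties and
# compare test error and sparsity at each value.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, n_test, p, k = 100, 2000, 200, 5
beta = np.zeros(p)
beta[:k] = [3.0, -2.0, 1.5, -1.0, 0.5]
X, X_test = rng.standard_normal((n, p)), rng.standard_normal((n_test, p))
y = X @ beta + rng.standard_normal(n)
y_test = X_test @ beta + rng.standard_normal(n_test)

for alpha in [0.001, 0.01, 0.1, 1.0, 10.0]:
    fit = Lasso(alpha=alpha, max_iter=50_000).fit(X, y)
    mse = np.mean((y_test - fit.predict(X_test)) ** 2)
    print(f"alpha={alpha:7.3f}  nonzeros={np.count_nonzero(fit.coef_):4d}  test MSE={mse:.3f}")
```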