Some theoretical results on the Grouped Variables Lasso

Chesneau, Christophe; Hebiri, Mohamed

doi:10.3103/s1066530708040030

Cited by 33 publications

(25 citation statements)

References 22 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Our techniques of proofs build upon and extend those in these papers. Several papers analyzing statistical properties of the Group Lasso estimator appeared quite recently [3,10,16,19,24,25,26,31]. Most of them are focused on the Group Lasso for additive models [16,19,25,31] or generalized linear models [24].…”

Section: Previous Workmentioning

confidence: 99%

Oracle inequalities and optimal inference under group sparsity

Lounici¹,

Pontil²,

Geer³

et al. 2011

Ann. Statist.

302

367

View full text Add to dashboard Cite

We consider the problem of estimating a sparse linear regression vector β * under a gaussian noise model, for the purpose of both prediction and model selection. We assume that prior knowledge is available on the sparsity pattern, namely the set of variables is partitioned into prescribed groups, only few of which are relevant in the estimation process. This group sparsity assumption suggests us to consider the Group Lasso method as a means to estimate β * . We establish oracle inequalities for the prediction and ℓ 2 estimation errors of this estimator. These bounds hold under a restricted eigenvalue condition on the design matrix. Under a stronger coherence condition, we derive bounds for the estimation error for mixed (2, p)-norms with 1 ≤ p ≤ ∞. When p = ∞, this result implies that a threshold version of the Group Lasso estimator selects the sparsity pattern of β * with high probability. Next, we prove that the rate of convergence of our upper bounds is optimal in a minimax sense, up to a logarithmic factor, for all estimators over a class of group sparse vectors. Furthermore, we establish lower bounds for the prediction and ℓ 2 estimation errors of the usual Lasso estimator. Using this result, we demonstrate that the Group Lasso can achieve an improvement in the prediction and estimation properties as compared to the Lasso.An important application of our results is provided by the problem of estimating multiple regression equation simultaneously or multi-task learning. In this case, our results lead to refinements of the results in [22] and allow one to establish the quantitative advantage of the Group Lasso over the usual Lasso in the multi-task setting. Finally, within the same setting, we show how our results can be extended to more general noise distributions, of which we only require the fourth moment to be finite. To obtain this extension, we establish a new maximal moment inequality, which may be of independent interest. 1 The phrase "β * is sparse" means that most of the components of this vector are equal to zero. 1 problem is relevant range from multi-task learning [2, 23, 28] and conjoint analysis [14,20] to longitudinal data analysis [11] as well as the analysis of panel data [15,38], among others. We briefly review these different settings in the course of the paper. In particular, multi-task learning provides a main motivation for our study. In that setting each regression equation corresponds to a different learning task; in addition to the requirement that M ≫ n, we also allow for the number of tasks T to be much larger than n. Following [2] we assume that there are only few common important variables which are shared by the tasks. That is, we assume that the vectors β * 1 , . . . , β * T are not only sparse but also have their sparsity patterns included in the same set of small cardinality. This group sparsity assumption induces a relationship between the responses and, as we shall see, can be used to improve estimation. The model (1.2) can be reformulated as a single regression problem of th...

show abstract

Section: Previous Workmentioning

confidence: 99%

Oracle inequalities and optimal inference under group sparsity

Lounici¹,

Pontil²,

Geer³

et al. 2011

Ann. Statist.

302

367

View full text Add to dashboard Cite

show abstract

“…Other methods have been introduced that have concave and continuous differentiable penalties, such as a smoothly clipped absolute deviation (SCAD) penalty defined as

p_{italicλ} false(false| italicβ false| false) = {normalΣ}_{j = 1}^{p} italicλ false[I false(false| {italicβ}_{j} false| \leq italicλ false) + I false(false| {italicβ}_{j} false| > italicλ false) (a λ - | β_{j} {false| false)}_{+} false/ {false(a - 1 false) λ} false] for some a > 2

for some a > 2 (Fan & Li ) and a minimax concave penalty (MCP) with

p_{italicλ} false(false| italicβ false| false) = {normalΣ}_{j = 1}^{p} false(a italicλ - false| {italicβ}_{j} {|)}_{+} / a

(Zhang ). Most of these methods are consistent when the sample size is much larger than the number of parameters (Zhao & Yu ; Meinshausen & Bühlmann ; Bunea, Tsybakov & Wegkamp ; Huang, Ma & Zhang ; Chesneau & Hebiri ; Wainwright ).…”

Section: Introductionmentioning

confidence: 99%

“…Other methods have been introduced that have concave and continuous differentiable penalties, such as a smoothly clipped absolute deviation (SCAD) penalty defined as p (| |) = p j=1 [I (| j | ) + I (| j | > )(a − | j |) + ={(a − 1) }] for some a > 2 for some a > 2 (Fan & Li 2001) and a minimax concave penalty (MCP) with p (| |) = p j=1 (a − | j |) + =a (Zhang 2010). Most of these methods are consistent when the sample size is much larger than the number of parameters (Zhao & Yu 2006;Meinshausen & Bühlmann 2006;Bunea, Tsybakov & Wegkamp 2007;Huang, Ma & Zhang 2008;Chesneau & Hebiri 2008;Wainwright 2009).…”

Section: Introductionmentioning

confidence: 99%

Multiple Hypothesis Testing for Variable Selection

Rohart

2016

Aus NZ J of Statistics

View full text Add to dashboard Cite

Summary We propose two new procedures based on multiple hypothesis testing for correct support estimation in high‐dimensional sparse linear models. We conclusively prove that both procedures are powerful and do not require the sample size to be large. The first procedure tackles the atypical setting of ordered variable selection through an extension of a testing procedure previously developed in the context of a linear hypothesis. The second procedure is the main contribution of this paper. It enables data analysts to perform support estimation in the general high‐dimensional framework of non‐ordered variable selection. A thorough simulation study and applications to real datasets using the R package mht shows that our non‐ordered variable procedure produces excellent results in terms of correct support estimation as well as in terms of mean square errors and false discovery rate, when compared to common methods such as the Lasso, the SCAD penalty, forward regression or the false discovery rate procedure (FDR).

show abstract

“…Those results lead to the refinements of their previous results for multi-task learning (see [17]). The behavior of the Lasso and Group Lasso regarding their selection and estimation properties have been studied in : [16,24,40,29,39,25] for Lasso in linear regression; [9,26] for Group Lasso in linear regression; [31,22,13] for additive models. Few results on the Lasso and Group Lasso concern logistic regression model.…”

Section: Introductionmentioning

confidence: 99%

Non-asymptotic oracle inequalities for the Lasso and Group Lasso in high dimensional logistic model

Kwemou

2016

ESAIM: PS

View full text Add to dashboard Cite

We consider the problem of estimating a function f 0 in logistic regression model.We propose to estimate this function f 0 by a sparse approximation build as a linear combination of elements of a given dictionary of p functions. This sparse approximation is selected by the Lasso or Group Lasso procedure. In this context, we state non asymptotic oracle inequalities for Lasso and Group Lasso under restricted eigenvalue assumption as introduced in [4].

show abstract

Some theoretical results on the Grouped Variables Lasso

Cited by 33 publications

References 22 publications

Oracle inequalities and optimal inference under group sparsity

Oracle inequalities and optimal inference under group sparsity

Multiple Hypothesis Testing for Variable Selection

Non-asymptotic oracle inequalities for the Lasso and Group Lasso in high dimensional logistic model

Contact Info

Product

Resources

About