2021
DOI: 10.1080/10618600.2020.1845184

MIP-BOOST: Efficient and Effective L0 Feature Selection for Linear Regression

Abstract: Because of continuous advances in mathematical programming, Mixed Integer Optimization has become competitive vis-à-vis popular regularization methods for selecting features in regression problems. The approach exhibits unquestionable foundational appeal and versatility, but also poses important challenges. We tackle these challenges, reducing computational burden when tuning the sparsity bound (a parameter which is critical for effectiveness) and improving performance in the presence of feature collinearity and…
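For context (a standard formulation from the best-subset selection literature, sketched here rather than quoted from the paper), the $\ell_0$-constrained regression problem behind the sparsity bound mentioned in the abstract can be written as:

```latex
% Best-subset (L0-constrained) least squares; k is the sparsity bound tuned by MIP-BOOST
\min_{\beta \in \mathbb{R}^p} \ \lVert y - X\beta \rVert_2^2
\quad \text{subject to} \quad \lVert \beta \rVert_0 \le k
```

Here $\lVert \beta \rVert_0$ counts the nonzero coefficients, so k directly controls how many features enter the model.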

Cited by 15 publications (14 citation statements)
References 30 publications
“…We are also exploring more efficient tuning strategies for the sparsity and trimming levels, as well as the ridge-like parameter, if present. Utilizing approaches such as warm-starts or integrated cross-validation [56] can substantially reduce the computational burden for subsequent runs of the MIP algorithm, and allow better tuning. If the trimming level for MIProb is inflated, a re-weighting approach may also be included in order to increase the efficiency of the estimator as in [10], as well as approaches based on the forward search [45].…”
Section: Discussion (mentioning)
confidence: 99%
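The warm-start idea mentioned in this excerpt can be sketched as follows. This is a hypothetical illustration, not code from the cited papers: build_l0_model is an assumed helper that returns a Gurobi model together with its binary selection variables (a generic version is sketched further down this page), and the support found at one sparsity level is passed as a MIP start for the next.

```python
# Hypothetical sketch: reuse the support from sparsity level k as a MIP start for k + 1.
# Assumes gurobipy models; build_l0_model(X, y, k) is an assumed helper, not from the papers.
def tune_sparsity_with_warm_starts(X, y, k_max):
    prev_support = None
    results = {}
    for k in range(1, k_max + 1):
        model, z = build_l0_model(X, y, k)
        if prev_support is not None:
            for j, zj in enumerate(z):
                zj.Start = 1.0 if j in prev_support else 0.0  # Gurobi MIP-start attribute
        model.optimize()
        prev_support = {j for j, zj in enumerate(z) if zj.X > 0.5}
        results[k] = (model.ObjVal, sorted(prev_support))
    return results
```

Each run then begins from a feasible incumbent, which is one simple way warm starts can cut the cost of tuning the sparsity level; the integrated cross-validation scheme of [56] is a separate, more elaborate strategy.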
“…This is natural, given methods like enetLTS are heuristics and avoid directly solving the full combinatorial problem. As discussed in more detail in [11,56], a common challenge with MIP formulations is the weak lower bound produced by the relaxed version of the problem. Thus, while the optimal solution may have already been found, the majority of computing time may be used to verify its optimality.…”
Section: Computational Details (mentioning)
confidence: 99%
“…A MIQP algorithm for the special case of feature selection (or sparse regression), $\ell_0$-cons$(\lVert Ax - b \rVert_2, k, \mathbb{R}^n)$, was proposed in [45], including the aforementioned ways to compute tighter big-M bounds; some statistical properties of such sparse regression problems and relations to their regularized versions are discussed in, e.g., [348,298]. Other tweaks of the straightforward big-M MIQP approach are discussed in [210] (see also [14]). Introducing a ridge regularization term to the regression objective, [49] recast the problem as a binary convex optimization problem and propose an outer-approximation solution algorithm that scales to large dimensions, at least for sufficiently small k. It is also possible to recast such cardinality-constrained least-squares problems (with ridge penalty) as mixed-integer semidefinite programs (MISDPs), see [283,162], but those can only be solved exactly for small-scale instances, despite providing stronger relaxations.…”
Section: Cardinality-constrained Optimization (mentioning)
confidence: 99%
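The big-M MIQP formulation referred to in this excerpt can be written down directly with a MIP solver. The sketch below is a generic illustration under stated assumptions (gurobipy as the modeling interface, a user-supplied big-M bound M); it is not code from [45] or any of the other cited works.

```python
# Generic big-M MIQP for cardinality-constrained least squares:
#   min ||X beta - y||^2   s.t.  -M z_j <= beta_j <= M z_j,  sum_j z_j <= k,  z_j binary.
# Assumes gurobipy; M is an assumed, user-supplied bound (tighter bounds help, as noted in [45]).
import gurobipy as gp
from gurobipy import GRB

def build_l0_model(X, y, k, M=10.0):
    n, p = X.shape
    m = gp.Model("l0_least_squares")
    beta = m.addVars(p, lb=-M, ub=M, name="beta")
    z = m.addVars(p, vtype=GRB.BINARY, name="z")
    for j in range(p):
        # beta_j may be nonzero only if feature j is selected (z_j = 1)
        m.addConstr(beta[j] <= M * z[j])
        m.addConstr(beta[j] >= -M * z[j])
    m.addConstr(gp.quicksum(z[j] for j in range(p)) <= k)  # sparsity bound
    # Quadratic objective: residual sum of squares
    resid = [y[i] - gp.quicksum(X[i, j] * beta[j] for j in range(p)) for i in range(n)]
    m.setObjective(gp.quicksum(r * r for r in resid), GRB.MINIMIZE)
    return m, [z[j] for j in range(p)]
```

Because the continuous relaxation of this model is weak (as discussed in the "Computational Details" excerpt above), in practice one often sets a relative gap tolerance or a time limit, e.g. m.Params.MIPGap = 0.01 or m.Params.TimeLimit = 600, instead of waiting for a full optimality proof.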
“…This guarantees the achievability of high-breakdown estimates (see below). Modern MIP solvers can be used to solve the formulation in (6) with optimality guarantees (Bertsimas et al. 2016; Insolia et al. 2020; Kenney et al. 2021). However, in order to reduce the computational burden, one can also use well-established heuristic algorithms (Alfons et al. 2013; Kurnaz et al. 2017).…”
Section: Conditions List 1 (Penalty Function) (mentioning)
confidence: 99%
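To make the two sketches above concrete, here is a hypothetical end-to-end example on synthetic data. It only exercises the generic big-M helper defined earlier; it does not implement the robust formulation (6) or the heuristics (Alfons et al. 2013; Kurnaz et al. 2017) discussed in this excerpt.

```python
# Hypothetical usage of the sketches above on synthetic data (illustration only).
import numpy as np

rng = np.random.default_rng(0)
n, p, k_true = 100, 20, 3
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k_true] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

model, z = build_l0_model(X, y, k=k_true, M=5.0)
model.Params.TimeLimit = 60          # cap the time spent proving optimality
model.optimize()
print("selected features:", [j for j, zj in enumerate(z) if zj.X > 0.5])
```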