Simultaneous Feature Selection and Outlier Detection with Optimality Guarantees
Preprint, 2020
DOI: 10.48550/arxiv.2007.06114

Abstract: Sparse estimation methods capable of tolerating outliers have been broadly investigated in the last decade. We contribute to this research considering high-dimensional regression problems contaminated by multiple mean-shift outliers which affect both the response and the design matrix. We develop a general framework for this class of problems and propose the use of mixed-integer programming to simultaneously perform feature selection and outlier detection with provably optimal guarantees. We characterize the t…

Cited by 2 publications (17 citation statements)
References 60 publications (110 reference statements)
“…However, since we rely on nonconcave penalization methods, our proposal satisfies oracle properties under weaker assumptions compared to existing robust estimators based on convex penalties (Kurnaz et al. 2017; Alfons et al. 2013). This provides an important bridge between the latter and L0-constrained formulations with optimality guarantees (Insolia et al. 2020). Moreover, unlike "soft" trimming estimators, which produce a general down-weighting of all points (Loh 2017; Smucler and Yohai 2017; Chang et al. 2018; Freue et al. 2019; Amato et al. 2021), our proposal is effective in estimating full weights for non-outlying observations.…”
Section: Introduction (mentioning)
confidence: 97%
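
The contrast drawn in this passage between "soft" and "hard" trimming can be made concrete with a small sketch. The weight functions below are standard illustrative choices (Tukey's bisquare for soft trimming, a simple residual cutoff for hard trimming); the tuning constants are placeholder assumptions, not values from the cited papers.

```python
# Illustrative sketch only: soft vs. hard trimming weights for robust regression.
import numpy as np

def tukey_bisquare_weight(r, c=4.685):
    """Soft trimming: the weight decays smoothly with the scaled residual r,
    so even well-fitting points receive weight slightly below 1."""
    u = np.clip(np.abs(r) / c, 0.0, 1.0)
    return (1.0 - u**2) ** 2

def hard_trim_weight(r, cutoff=2.5):
    """Hard trimming: full weight (exactly 1) below the cutoff, 0 above it."""
    return (np.abs(r) <= cutoff).astype(float)

residuals = np.array([0.1, 1.0, 2.0, 3.0, 8.0])
print(tukey_bisquare_weight(residuals))  # ~[0.999 0.911 0.669 0.348 0.   ]
print(hard_trim_weight(residuals))       # [1. 1. 1. 0. 0.]
```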
“…The MSOM assumes that outlying cases have a shift in mean; maximum likelihood estimation (MLE) leads to their removal from the fit, i.e., to the assignment of zero weights to the cases identified as outliers. While the MSOM was traditionally studied in low-dimensional scenarios, it has recently been extended to high-dimensional linear models, where the use of regularization techniques is fundamental (She and Owen 2011; Alfons et al. 2013; Kurnaz et al. 2017; Insolia et al. 2020). The VIOM, which is historically considered an alternative to the MSOM, assumes that contaminated errors have an inflated variance; outliers are retained but down-weighted in the fit.…”
Section: Introduction (mentioning)
confidence: 99%
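
For reference, the mean-shift outlier model described in this passage is commonly written as below. This is the generic textbook form, not an equation excerpted from the quoted papers:

```latex
% Mean-shift outlier model (MSOM): each case i carries its own shift gamma_i,
% which is nonzero only for outlying cases. MLE under this model assigns
% weight 0 to every case whose estimated shift is nonzero (hard trimming).
\[
  y_i = x_i^{\top}\beta + \gamma_i + \varepsilon_i,
  \qquad \varepsilon_i \overset{\mathrm{iid}}{\sim} \mathcal{N}(0,\sigma^2),
  \qquad \gamma_i \neq 0 \;\text{only if case $i$ is an outlier.}
\]
```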
“…Under a very general framework allowing for different loss and penalty functions, She et al. (2021) establish minimax bounds for the estimation error under the MSOM and propose an efficient algorithm for estimation. In a similar spirit, Insolia et al. (2020) propose a mixed-integer program (MIP) to constrain both the number of outliers and the number of relevant predictors using the L0 pseudo-norm. The authors develop guarantees for the algorithmic complexity and the statistical estimation error of this MIP under the MSOM, even for ultra-high-dimensional problems.…”
Section: Introduction (mentioning)
confidence: 99%
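
A minimal big-M sketch of this kind of L0-constrained MIP is given below. This is an illustration under assumptions, not the authors' exact formulation: the sparsity budgets k_p and k_n, the bound M, and the simulated data are all placeholder choices, and a MIQP-capable solver is required.

```python
# Sketch of an L0-constrained MIP for simultaneous feature selection and
# outlier detection under the MSOM, via a big-M formulation in cvxpy.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = 2.0
y = X @ beta_true + 0.5 * rng.standard_normal(n)
y[:5] += 8.0                       # mean-shift contamination in the response

k_p, k_n, M = 3, 5, 10.0           # assumed sparsity budgets and big-M bound

beta = cp.Variable(p)              # regression coefficients
gamma = cp.Variable(n)             # per-case mean shifts (nonzero => outlier)
z = cp.Variable(p, boolean=True)   # feature-inclusion indicators
w = cp.Variable(n, boolean=True)   # outlier indicators

constraints = [
    cp.abs(beta) <= M * z,   cp.sum(z) <= k_p,   # at most k_p active features
    cp.abs(gamma) <= M * w,  cp.sum(w) <= k_n,   # at most k_n flagged outliers
]
prob = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ beta - gamma)), constraints)
prob.solve(solver=cp.GUROBI)       # any MIQP-capable solver (GUROBI, MOSEK, ...)

print("selected features:", np.flatnonzero(z.value > 0.5))
print("flagged outliers :", np.flatnonzero(w.value > 0.5))
```

The big-M constraints tie each continuous variable to its binary indicator, so the two cardinality constraints enforce the L0 budgets on features and outliers directly.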
“…For lower signal strengths, however, the L0 pseudo-norm for either variable selection or outlier detection tends to suffer from high variability. For variable selection and prediction, for example, other penalties may deliver better performance (Hastie et al. 2020; Insolia et al. 2020). In addition, for computational tractability and stability, it is necessary to restrict the optimization to a tight neighborhood around the true regression parameters, as well as to have a good understanding of the number of relevant predictors and the number of outliers.…”
Section: Introduction (mentioning)
confidence: 99%