Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms 2019
DOI: 10.1137/1.9781611975482.171

High-Dimensional Robust Mean Estimation in Nearly-Linear Time

Abstract: We study the fundamental problem of high-dimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial-time algorithms for this problem with dimension-independent error guarantees for several families of structured distributions. In this work, we give the first nearly-linear time algorithms for high-dimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and sub-gaussian tail…
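The corruption model in the abstract can be illustrated with a much simpler (and statistically weaker) filtering heuristic than the paper's nearly-linear-time algorithm: repeatedly project the data onto the top eigenvector of the empirical covariance and discard the points with the most extreme projections, which adversarial outliers tend to inflate. This is a minimal sketch under that assumption, not the authors' method; the names `filter_mean`, `eps`, and `n_rounds` are illustrative.

```python
import numpy as np

def filter_mean(X, eps, n_rounds=5):
    """Illustrative robust mean estimate via naive spectral filtering.

    X:   (n, d) array, an eps-fraction of rows may be corrupted.
    eps: assumed corruption fraction.
    Each round removes the points whose projection onto the top
    eigenvector of the empirical covariance deviates most from the mean.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    mask = np.ones(n, dtype=bool)
    for _ in range(n_rounds):
        mu = X[mask].mean(axis=0)
        # Top eigenvector of the empirical covariance of surviving points.
        cov = np.cov(X[mask].T)
        _, eigvecs = np.linalg.eigh(cov)
        v = eigvecs[:, -1]
        # Squared deviation along the most suspicious direction.
        scores = ((X - mu) @ v) ** 2
        scores[~mask] = -1.0  # never re-select already removed points
        k = max(1, int(eps * n / n_rounds))
        mask[np.argsort(scores)[-k:]] = False
    return X[mask].mean(axis=0)

# Usage: 900 clean Gaussian samples in 5 dimensions plus 100 outliers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(900, 5)), np.full((100, 5), 5.0)])
print(np.linalg.norm(X.mean(axis=0)))           # plain mean is badly shifted
print(np.linalg.norm(filter_mean(X, eps=0.1)))  # filtered mean is near 0
```

Unlike the paper's algorithm, this sketch costs a full eigendecomposition per round and carries no dimension-independent error guarantee; it only conveys the filtering intuition behind that line of work.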

Cited by 70 publications (132 citation statements)
References 23 publications
“…In [22], the authors focused on the problem of high-dimensional linear regression in a robust model where an ε-fraction of the samples can be adversarially corrupted. Robust regression problems have also been studied in [15,21,38,6]. The above-mentioned articles assume corruption in both the design and the labels.…”
Section: Related Literature
confidence: 99%
“…This guarantee is still sub-optimal compared to the optimal sub-Gaussian rate [35]. Existing polynomial-time algorithms with optimal performance are designed for the batch setting and require either storing the entire dataset [6,5,36,32,11] or O(p log(1/δ)) storage [8]. On the other hand, we argue that trading off some statistical accuracy for a large savings in memory and computation is favorable in practice.…”
Section: Corollary 3 (Streaming Heavy-Tailed Mean Estimation)
confidence: 92%
“…In the batch setting, Hopkins [27] proposed a robust mean estimator that can be computed in polynomial time and that matches the error guarantees achieved by the empirical mean on Gaussian data. Following this work, efficient algorithms with improved asymptotic runtimes were given in [10,6,5,32,8]. Related works [43,38] provide more practical and computationally efficient algorithms with slightly sub-optimal rates.…”
Section: Related Work
confidence: 99%