2020
DOI: 10.48550/arxiv.2001.07805
Preprint

When does the Tukey median work?

Abstract: We analyze the performance of the Tukey median estimator under total variation (TV) distance corruptions. Previous results show that under Huber's additive corruption model, the breakdown point is 1/3 for high-dimensional halfspace-symmetric distributions. We show that under TV corruptions, the breakdown point reduces to 1/4 for the same set of distributions. We also show that a certain projection algorithm can attain the optimal breakdown point of 1/2. Both the Tukey median estimator and the projection algori…
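
As a rough illustration of the estimator the abstract studies: the Tukey (halfspace) depth of a point is the smallest fraction of samples lying in any closed halfspace whose boundary passes through that point, and a Tukey median is a point of maximal depth. The sketch below is not the paper's projection algorithm and not an exact computation. It assumes a finite set of random directions in place of all halfspaces (so it can only overestimate depth) and restricts candidates to the sample points. The function names, the number of directions, and the toy contamination are all hypothetical choices made for this example.

```python
import numpy as np

def random_directions(d, n_directions, rng):
    """Draw random unit vectors in R^d and return them together with their negatives."""
    u = rng.standard_normal((n_directions, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    return np.vstack([u, -u])

def approx_tukey_depths(data, directions):
    """Approximate halfspace depth of every sample point.

    For each direction u and each candidate x_i, compute the fraction of
    samples x with <u, x> >= <u, x_i>, then take the minimum over the
    directions tried. Using finitely many directions upper-bounds the
    exact depth.
    """
    proj = data @ directions.T            # shape (n_samples, n_dirs)
    depths = np.empty(len(data))
    for i in range(len(data)):
        frac = (proj >= proj[i]).mean(axis=0)
        depths[i] = frac.min()
    return depths

def approx_tukey_median(data, n_directions=200, seed=0):
    """Return the sample point of largest approximate depth, and that depth."""
    rng = np.random.default_rng(seed)
    directions = random_directions(data.shape[1], n_directions, rng)
    depths = approx_tukey_depths(data, directions)
    best = int(np.argmax(depths))
    return data[best], depths[best]

# Toy data (assumption): 300 standard Gaussian inliers in 2D plus
# 60 outliers clustered far away near (50, 50).
rng = np.random.default_rng(1)
inliers = rng.standard_normal((300, 2))
outliers = 50.0 + 0.1 * rng.standard_normal((60, 2))
data = np.vstack([inliers, outliers])

median, depth = approx_tukey_median(data)
print("sample mean:            ", data.mean(axis=0))
print("approximate Tukey median:", median, "with depth", depth)
```

In this toy setting the sample mean is pulled far toward the outlier cluster, while the deepest sample point stays near the inlier center. The cost of the loop grows with both the number of samples and the number of directions, and covering all directions adequately becomes harder as the dimension grows.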

Cited by 5 publications (5 citation statements)
References 8 publications
“…We introduce Algorithm 1, achieving the optimal sample complexity of O(d/min{αε, α²}) (Theorem 5). The main idea is to find an approximate Tukey median (which is known to be a robust estimate of the mean [85]), using the exponential mechanism of [70] to preserve privacy. Tukey median set.…”
Section: Background On Exponential Time Approaches For Gaussian Distr... (mentioning)
confidence: 99%
“…In particular, under our model, it achieves the optimal sample complexity and accuracy. This optimality follows from the well-known fact that the sample complexity of O((1/α²)(d + log(1/ζ))) cannot be improved upon even if we have no corruption, and the fact that the accuracy of O(α) cannot be improved upon even if we have infinite samples [85]. However, finding a Tukey median takes exponential time scaling as Õ(n^d) [68].…”
Section: Background On Exponential Time Approaches For Gaussian Distr... (mentioning)
confidence: 99%
“…We use 140 face images from Brazilian face database, where 100 of them are well-controlled frontal faces with neutral expressions, which are considered to be inliers. The rest of 40 images either have non-frontal orientation of the face, or have upside-down smiling expressions, which are considered to be outliers.…”
Section: Face Images (mentioning)
confidence: 99%
“…Classical robust mean estimation methods such as coordinate-wise median and geometric median have error bounds that scale with the dimension of the data [1], which results in poor performance in the high dimensional regime. A notable exception is Tukey's Median [2] that has an error bound that is independent of the dimension, when the fraction of outliers is less than a threshold [3,4]. However, the computational complexity of Tukey's Median algorithm is exponential in the dimension.…”
Section: Introduction (mentioning)
confidence: 99%
“…The presence of an ε-fraction of adversarial corruption not only makes the empirical average a bad estimator for the mean, but also makes it information-theoretically impossible to reduce the estimation error towards zero even with infinitely many examples. For estimating the mean of a spherical Gaussian, the Tukey median [39] achieves minimax-optimal error guarantees [7,45], but it is hard to compute in general [26,17]; the linear-time-computable coordinate-wise median gives much worse L₂ errors that grow with the dimension. Recent breakthroughs by [12,28] led to the best of both worlds: there are nearly-linear time algorithms achieving information-theoretically optimal error guarantees [9,11,19,23].…”
Section: Introduction (mentioning)
confidence: 99%
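
The dimension dependence of the coordinate-wise median mentioned in the last statement can be seen in a small simulation. This is only a sketch under assumed settings, not taken from any of the cited papers: the inliers are standard Gaussian with true mean zero, a fraction `eps` of the points is replaced by a single outlier value `shift` in every coordinate, and the helper names are hypothetical.

```python
import numpy as np

def corrupted_sample(d, n=2000, eps=0.1, shift=10.0, seed=0):
    """Standard Gaussian inliers in R^d, with an eps-fraction of the points
    replaced by outliers equal to `shift` in every coordinate (assumed model)."""
    rng = np.random.default_rng(seed)
    n_out = int(eps * n)
    inliers = rng.standard_normal((n - n_out, d))
    outliers = np.full((n_out, d), shift)
    return np.vstack([inliers, outliers])

def l2_errors(d, **kwargs):
    """L2 distance from the true mean (the origin) for the sample mean
    and for the coordinate-wise median."""
    data = corrupted_sample(d, **kwargs)
    err_mean = np.linalg.norm(data.mean(axis=0))
    err_median = np.linalg.norm(np.median(data, axis=0))
    return err_mean, err_median

for d in (4, 400):
    err_mean, err_median = l2_errors(d)
    print(f"d={d:4d}  mean error={err_mean:.2f}  "
          f"coordinate-wise median error={err_median:.2f}")
```

Each coordinate of the median is biased by a small, roughly constant amount, so the L₂ error accumulates across coordinates and grows on the order of the square root of the dimension in this setup. The median still does much better than the sample mean here, but its error is not dimension-free.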