2022
DOI: 10.1287/moor.2021.1216

Hidden Integrality and Semirandom Robustness of SDP Relaxation for Sub-Gaussian Mixture Model

Abstract: We consider the problem of estimating the discrete clustering structures under the sub-Gaussian mixture model. Our main results establish a hidden integrality property of a semidefinite programming (SDP) relaxation for this problem: while the optimal solution to the SDP is not integer-valued in general, its estimation error can be upper bounded by that of an idealized integer program. The error of the integer program, and hence that of the SDP, are further shown to decay exponentially in the signal-to-noise ratio. …
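The SDP relaxation referred to in the abstract is, in this literature, typically the Peng-Wei relaxation of k-means, in which the combinatorial cluster matrix is replaced by a positive semidefinite variable. The following is a minimal sketch of that relaxation, assuming the Peng-Wei form rather than the paper's exact program; the function name, the use of cvxpy, and the Gram-matrix objective are illustrative choices, not taken from the paper.

# Minimal sketch of the k-means SDP relaxation (Peng-Wei form), an assumption
# about the kind of program the abstract refers to, not the paper's exact SDP.
import numpy as np
import cvxpy as cp

def kmeans_sdp(X, k):
    """Relaxed k-means: X is an (n, d) data matrix, k the number of clusters."""
    n = X.shape[0]
    A = X @ X.T                         # Gram matrix of the data points
    Z = cp.Variable((n, n), PSD=True)   # relaxed cluster-membership matrix
    constraints = [
        Z >= 0,                         # entrywise nonnegative
        cp.sum(Z, axis=1) == 1,         # each row sums to one
        cp.trace(Z) == k,               # trace equals the number of clusters
    ]
    cp.Problem(cp.Maximize(cp.trace(A @ Z)), constraints).solve()
    return Z.value                      # generally not integer-valued

Cluster labels can then be read off the solution, for example by rounding Z. The abstract's point is that although the optimal Z is not integral in general, its estimation error is controlled by that of an idealized integer program.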

Cited by 2 publications (3 citation statements); references 23 publications.
“…Existing algorithms [71, 24, 85, 75, 34, 46, 23] … Table 1: Comparison of sample complexity, misclassification rate, and computational complexity under different signal strength assumptions. Since SNR ≥ S, the bounds in the second column imply those in the third column.…”
Section: Algorithm Sample Complexity
Citation type: mentioning (confidence: 99%)
“…In particular, when S ≫ 1 and n = Ω(d), it is known that Lloyd's algorithm [71, 24], semi-definite relaxations of k-means [85, 75, 34, 46, 23], and spectral algorithms [1] achieve an error rate of e^{−Ω(S)}. This rate depends suboptimally on SNR.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
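The e^{−Ω(S)} rate mentioned in the excerpt above can be illustrated with a small simulation. The sketch below uses scikit-learn's KMeans as a stand-in for Lloyd's algorithm on a two-component spherical Gaussian mixture; the sample size, dimension, and separation values are entirely hypothetical choices for illustration.

# Illustrative simulation (hypothetical settings): misclassification of a
# Lloyd-type algorithm on a two-component spherical Gaussian mixture, as a
# function of the separation between the centers.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n, d = 400, 20

for sep in [1.0, 2.0, 3.0, 4.0, 5.0]:
    mu = np.zeros(d)
    mu[0] = sep / 2.0                    # centers at +/- sep/2 on one coordinate
    labels = rng.integers(0, 2, size=n)
    X = rng.standard_normal((n, d)) + np.where(labels[:, None] == 1, mu, -mu)
    pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    err = min(np.mean(pred != labels), np.mean(pred == labels))  # up to label swap
    print(f"separation {sep:.1f}: misclassification rate {err:.3f}")

As the separation grows, the empirical misclassification rate drops off sharply, which is the qualitative behavior the cited e^{−Ω(S)} bounds describe.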
“…signed clustering [18]. Note further that the problem of recovering well-separated clusters has been investigated through the lens of semidefinite programming [19][20][21][22][23], which gives an error bound on the recovered cluster matrix or, even better, proves exact recovery of the cluster labels (up to label permutations). It has also been addressed using non-negative matrix factorisation [24], which gives a probably approximately correct (PAC) Bayesian approach to the problem.…”
Section: Challenges and Achieved Results
Citation type: mentioning (confidence: 99%)