2017 IEEE International Symposium on Information Theory (ISIT) 2017
DOI: 10.1109/isit.2017.8006706

Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization

Abstract: We study the problem of detecting a structured, low-rank signal matrix corrupted with additive Gaussian noise. This includes clustering in a Gaussian mixture model, sparse PCA, and submatrix localization. Each of these problems is conjectured to exhibit a sharp information-theoretic threshold, below which the signal is too weak for any algorithm to detect. We derive upper and lower bounds on these thresholds by applying the first and second moment methods to the likelihood ratio between these "planted models" …

Cited by 16 publications
(41 citation statements)
References 40 publications
“…While bounds on the second moment do not a priori imply anything about the recovery problem, it follows from results of Banks et al (2017) that many of our non-detection results imply the corresponding non-recovery results. The value of the second moment also yields bounds on hypothesis testing power (see Proposition 2.5).…”
mentioning
confidence: 61%
“…There will be times when the above second moment is unbounded but we are still able to prove contiguity using a modified second moment that conditions away from rare 'bad' events that would otherwise dominate the second moment. This idea has appeared previously (Arias-Castro and Verzelen, 2014; Verzelen and Arias-Castro, 2015; Banks et al, 2016, 2017).…”
mentioning
confidence: 67%
“…The supervised on labeled curve corresponds to (2) with the parameter α replaced by αη and is the best possible performance when only a fraction η of the data points having labels are used. The unsupervised curve corresponds to (3) where all the data points are used but without any label. Finally, the oracle curve corresponds to (1) where the centers of the clusters are known (corresponding to the case α → ∞).…”
Section: Unsupervised Case
mentioning
confidence: 99%
“…Using exact but non-rigorous methods from statistical physics, [6,24] determine the critical values for α and σ at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. Rigorous results on this model are given in [3] where bounds on the critical values are obtained. The precise thresholds were then determined in [27].…”
Section: Related Work
mentioning
confidence: 99%