Perturbation bounds for singular subspaces have a wide range of applications, including high‐dimensional statistics, machine learning, and applied mathematics. This paper focuses on perturbation bounds for singular subspaces when a signal matrix is corrupted by a special heteroscedastic noise matrix whose entries in the same row or column share a similar variance. First, we extend the homoscedastic results of T. Tony Cai and Anru Zhang (see Rate‐optimal perturbation bounds for singular subspaces with applications to high‐dimensional statistics, The Annals of Statistics, 2018, 46(1), 60–89) to the heteroscedastic case. We then apply the resulting tools to the heteroscedastic clustering model. We find that our upper bound on the clustering misclassification rate improves on that of T. Tony Cai, Rungang Han, and Anru Zhang (see arXiv:2008.12434, 2020).
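As a rough illustration of the quantity such perturbation bounds control, the sketch below computes a spectral-norm sin Θ distance between the leading left singular subspaces of a signal matrix and its noisy version. The dimensions, the rank, the row-wise noise scales, and the helper names `sin_theta_distance` and `top_left_singular_vectors` are all assumptions made for this example; none of them come from the paper.

```python
import numpy as np

def sin_theta_distance(U, U_hat):
    """Spectral-norm sin-Theta distance between the column spaces of U and U_hat.

    Both inputs are assumed to have orthonormal columns and the same shape.
    """
    # Singular values of U^T U_hat are the cosines of the principal angles.
    cosines = np.linalg.svd(U.T @ U_hat, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)
    # The largest principal angle corresponds to the smallest cosine.
    return float(np.sqrt(1.0 - cosines.min() ** 2))

def top_left_singular_vectors(M, r):
    """Return the r leading left singular vectors of M as columns."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

# Hypothetical setup: a rank-2 signal plus noise whose standard deviation
# is shared within each row but differs between rows.
rng = np.random.default_rng(0)
p1, p2, r = 60, 40, 2
X = 10 * rng.standard_normal((p1, r)) @ rng.standard_normal((r, p2))
row_sd = rng.uniform(0.5, 1.5, size=(p1, 1))
Z = row_sd * rng.standard_normal((p1, p2))

U = top_left_singular_vectors(X, r)
U_hat = top_left_singular_vectors(X + Z, r)
dist = sin_theta_distance(U, U_hat)
print(f"sin-Theta distance: {dist:.3f}")
```

The distance is 0 when the two subspaces coincide and 1 when some direction of one subspace is orthogonal to the other.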
Clustering is an important tool in statistics, machine learning, and applied mathematics. This paper considers a clustering model in which the noise matrix consists of independent sub‐Gaussian entries whose variances may vary across coordinates. Our aim is to estimate the error between the label vector and its estimator. We provide upper bounds on the misclassification rate, both in expectation and in probability. Finally, simulations support our theoretical results and illustrate the advantage of the proposed estimator.
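To make the misclassification rate concrete, here is a minimal sketch under assumed conditions: two clusters with labels in {-1, +1}, a common mean vector `mu`, and Gaussian noise whose standard deviation differs by column. The estimator used here (the sign of the leading left singular vector) is a generic spectral stand-in chosen for illustration, not the estimator defined in the paper, and every name and parameter value is hypothetical.

```python
import numpy as np

def misclassification_rate(labels, labels_hat):
    """Fraction of wrongly labelled points, minimised over a global sign flip.

    Both inputs are assumed to take values in {-1, +1}.
    """
    labels = np.asarray(labels)
    labels_hat = np.asarray(labels_hat)
    err = np.mean(labels != labels_hat)
    # Cluster labels are only identifiable up to swapping the two clusters.
    return float(min(err, 1.0 - err))

def spectral_labels(Y):
    """Assign labels by the sign of the leading left singular vector of Y."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    signs = np.sign(U[:, 0])
    signs[signs == 0] = 1  # break exact ties arbitrarily
    return signs.astype(int)

# Hypothetical simulation: n points in p dimensions.
rng = np.random.default_rng(1)
n, p = 200, 50
labels = rng.choice([-1, 1], size=n)
mu = np.full(p, 1.0)
col_sd = rng.uniform(0.5, 1.5, size=p)          # variance varies by coordinate
Z = rng.standard_normal((n, p)) * col_sd
Y = np.outer(labels, mu) + Z

rate = misclassification_rate(labels, spectral_labels(Y))
print(f"misclassification rate: {rate:.3f}")
```

Taking the minimum over the sign flip reflects that swapping the two cluster names should not count as an error.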
Motivated by Jin–Ke–Wang’s work (J. Jin, Z. T. Ke and W. Wang, Ann. Statist. 45(5) (2017) 2151–2189), this paper studies estimation of the misclassification rate in the Asymptotic Rare and Weak (ARW) model. In contrast to Jin–Ke–Wang’s theorem, we measure the performance of the estimator by the misclassification rate instead of the Hamming distance, and we extend the noise from Gaussian to sub-Gaussian. We first give a probability estimate with a convergence rate under some conditions, and then prove that these conditions are also necessary. A direct corollary of our estimate can be compared with Jin–Ke–Wang’s theorem: our statistical limit coincides with theirs.