We observe an $N\times M$ matrix $Y_{ij}=s_{ij}+\xi_{ij}$ with $\xi_{ij}\sim {\mathcal {N}}(0,1)$ i.i.d. in $i,j$, and $s_{ij}\in \mathbb {R}$. We test the null hypothesis $s_{ij}=0$ for all $i,j$ against the alternative that there exists a submatrix of size $n\times m$ with significant elements, in the sense that $s_{ij}\ge a>0$. We propose a test procedure and compute the asymptotic detection boundary $a$ such that the maximal testing risk tends to 0 as $M\to\infty$, $N\to\infty$, $p=n/N\to0$, $q=m/M\to0$. We prove that this boundary is asymptotically sharp minimax under some additional constraints. Relations with other testing problems are discussed. We propose a testing procedure that adapts to unknown $(n,m)$ within a given set and compute the adaptive sharp rates. The implementation of our test procedure on synthetic data shows excellent behavior for sparse, not necessarily square, matrices. We extend our sharp minimax results in several directions: first, to Gaussian matrices with unknown variance; next, to matrices of random variables having a (non-Gaussian) distribution from an exponential family; and, finally, to a two-sided alternative for matrices with Gaussian elements. Comment: Published at http://dx.doi.org/10.3150/12-BEJ470 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
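For intuition, here is a minimal Python sketch of this model together with an exhaustive scan statistic over all $n\times m$ submatrices. This is an illustration, not the paper's calibrated procedure; all numerical values are illustrative assumptions, and the exhaustive scan is feasible only for tiny $N$, $M$.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Model: Y_ij = s_ij + xi_ij with xi_ij ~ N(0,1) i.i.d.; under the
# alternative an n x m submatrix of s has entries equal to a > 0.
N, M, n, m, a = 8, 8, 2, 2, 2.5   # illustrative sizes
Y = rng.standard_normal((N, M))
rows = rng.choice(N, size=n, replace=False)
cols = rng.choice(M, size=m, replace=False)
Y[np.ix_(rows, cols)] += a        # plant the significant submatrix

def scan_statistic(Y, n, m):
    """Largest standardized sum over all n x m submatrices.

    Exhaustive enumeration is combinatorial in general; it is used
    here only because N and M are tiny.
    """
    N, M = Y.shape
    best = -np.inf
    for r in itertools.combinations(range(N), n):
        row_sums = Y[list(r), :].sum(axis=0)
        for c in itertools.combinations(range(M), m):
            best = max(best, row_sums[list(c)].sum())
    return best / np.sqrt(n * m)  # each submatrix sum is N(0, nm) under H0

print(scan_statistic(Y, n, m))    # large values suggest a planted submatrix
```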
We study the problem of detecting a $p$-dimensional sparse vector of parameters in the linear regression model with Gaussian noise. We establish the detection boundary, i.e., the necessary and sufficient conditions for the possibility of successful detection as both the sample size $n$ and the dimension $p$ tend to infinity. Testing procedures that achieve this boundary are also exhibited. Our results encompass the high-dimensional setting ($p \gg n$). The main message is that, under some conditions, the detection boundary phenomenon previously established for the Gaussian sequence model extends to high-dimensional linear regression. Finally, we establish the detection boundaries when the variance of the noise is unknown. Interestingly, the rate of the detection boundary in the high-dimensional setting with unknown variance can differ from the rate for the case of known variance.
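A hedged sketch of one standard testing idea for this problem: a Bonferroni-type max test on standardized marginal correlations. The paper's boundary-attaining procedures may differ, and all numbers below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear model Y = X beta + eps, eps ~ N(0, I_n); under H1, beta is
# k-sparse with nonzero entries of size A (illustrative values).
n, p, k, A = 200, 1000, 5, 1.5
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[rng.choice(p, size=k, replace=False)] = A
Y = X @ beta + rng.standard_normal(n)

# Standardized marginal statistics: z_j ~ N(0,1) under the null.
z = X.T @ Y / np.linalg.norm(X, axis=0)
# Reject H0 when the maximum exceeds the Bonferroni threshold.
reject = np.max(np.abs(z)) > np.sqrt(2 * np.log(p))
print(reject)
```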
Ill-posed inverse problems arise in various scientific fields. We consider the signal detection problem for mildly, severely and extremely ill-posed inverse problems with $l^q$-ellipsoids (bodies), $q\in(0,2]$, for Sobolev, analytic and generalized analytic classes of functions under the Gaussian white noise model. We study both rate and sharp asymptotics for the error probabilities in the minimax setup. By construction, the derived tests are often nonadaptive. Minimax rate-optimal adaptive tests of rather simple structure are also constructed. Comment: Published at http://dx.doi.org/10.1214/12-AOS1011 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
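For intuition, a minimal sketch of a nonadaptive chi-square-type test in the sequence-space form of the white noise model. The operator decay, noise level, truncation level and test signal are illustrative assumptions, not the paper's calibrated choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# Sequence-space form: y_k = b_k * theta_k + eps * xi_k, xi_k ~ N(0,1),
# where b_k = k^{-t} models a mildly ill-posed operator.
K, K0, eps, t = 2000, 100, 0.05, 1.0
k = np.arange(1, K + 1)
b = k ** (-t)

def chi2_statistic(theta):
    y = b * theta + eps * rng.standard_normal(K)
    # Standardized chi-square over the first K0 coefficients:
    # approximately N(0,1) under H0 by the central limit theorem.
    u = (y[:K0] / eps) ** 2
    return (u.sum() - K0) / np.sqrt(2 * K0)

print(chi2_statistic(np.zeros(K)))        # H0: no signal
print(chi2_statistic(0.5 * k ** (-2.0)))  # H1: a smooth signal
```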
We study the problem of classification of $d$-dimensional vectors into two classes (one of which is 'pure noise') based on a training sample of size $m$. The main specific feature is that the dimension $d$ can be very large. We suppose that the difference between the distribution of the population and that of the noise is only in a shift, which is a sparse vector. For Gaussian noise, fixed sample size $m$, and dimension $d$ that tends to infinity, we obtain the sharp classification boundary, i.e., the necessary and sufficient conditions for the possibility of successful classification. We propose classifiers attaining this boundary. We also give extensions of the result to the case where the sample size $m$ depends on $d$ and satisfies the condition $(\log m)/\log d \to \gamma$, $0 \le \gamma < 1$, and to the case of non-Gaussian noise satisfying the Cramér condition.
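A hedged sketch of one natural classifier for this setting: threshold the coordinates of the training mean to estimate the support of the sparse shift, then test a new vector on that support. The threshold and all sizes are illustrative assumptions, not the paper's exact boundary-attaining rule.

```python
import numpy as np

rng = np.random.default_rng(3)

# Training sample of size m from N(mu, I_d), where mu is a k-sparse
# shift; the other class is pure noise N(0, I_d). Illustrative values.
d, m, k, A = 10_000, 20, 30, 1.2
mu = np.zeros(d)
mu[rng.choice(d, size=k, replace=False)] = A
train = mu + rng.standard_normal((m, d))

# Estimate the shift's support by thresholding the training mean at
# the Bonferroni level sqrt(2 log d / m).
xbar = train.mean(axis=0)
support = np.abs(xbar) > np.sqrt(2 * np.log(d) / m)

def classify(x):
    """Return True if x is attributed to the shifted class."""
    t = np.sum(x[support] * np.sign(xbar[support]))
    return t > 0.5 * np.sum(np.abs(xbar[support]))  # midpoint rule

print(classify(mu + rng.standard_normal(d)))   # a shifted observation
print(classify(rng.standard_normal(d)))        # pure noise
```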
We observe an $N\times M$ matrix of independent, identically distributed Gaussian random variables which are centered except for the elements of some $n\times m$ submatrix, where the mean is larger than some $a>0$. The submatrix is sparse in the sense that $n/N$ and $m/M$ tend to 0, whereas $n$, $m$, $N$ and $M$ tend to infinity. We consider the problem of selecting the random variables with significantly large mean values. We give sufficient conditions on $a$ as a function of $n$, $m$, $N$ and $M$ and construct a uniformly consistent procedure in order to perform sharp variable selection. We also prove minimax lower bounds under necessary conditions that are complementary to the previous conditions. The critical values $a^*$ separating the necessary and sufficient conditions are sharp (we show exact constants). We note a gap between the critical values $a^*$ for selection of variables and those for detecting that such a submatrix exists, given by [7]. When $a$ lies in this gap, consistent detection is possible, but no consistent selector of the corresponding variables can be found.
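For illustration only, a naive selector for this model: keep the $n$ rows and $m$ columns with the largest totals. This is a simplification under assumed known $(n,m)$ and a fairly strong signal, not the paper's sharp procedure; it may fail for weaker signals.

```python
import numpy as np

rng = np.random.default_rng(4)

# Same submatrix model: an n x m block of the N x M Gaussian matrix
# has mean a > 0; here the goal is to recover its rows and columns.
N, M, n, m, a = 50, 60, 5, 6, 4.0  # illustrative values
Y = rng.standard_normal((N, M))
true_rows = rng.choice(N, size=n, replace=False)
true_cols = rng.choice(M, size=m, replace=False)
Y[np.ix_(true_rows, true_cols)] += a

# Naive selection: the n rows and m columns with the largest sums.
sel_rows = np.argsort(Y.sum(axis=1))[-n:]
sel_cols = np.argsort(Y.sum(axis=0))[-m:]
print(set(sel_rows) == set(true_rows))
print(set(sel_cols) == set(true_cols))
```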