Energy landscape for large average submatrix detection problems in Gaussian random matrices

Bhamidi, Shankar; Dey, Partha S.; Nobel, Andrew B.

doi:10.1007/s00440-017-0766-0

Cited by 12 publications

(19 citation statements)

References 45 publications


“…The problem of finding asymptotically the largest average entry of $k \times k$ submatrices of $C^n$ was recently studied by Bhamidi et al. [BDN12] (see also [SN13] for a related study) and questions arising in this paper constitute the motivation for our work. It was shown in [BDN12] using non-constructive methods that the largest achievable average entry of a $k \times k$ submatrix of $C^n$ is asymptotically with high probability (w.h.p.)…”

Section: Introduction (mentioning)

confidence: 99%

“…[BDN12] (see also [SN13] for a related study) and questions arising in this paper constitute the motivation for our work. It was shown in [BDN12] using non-constructive methods that the largest achievable average entry of a $k \times k$ submatrix of $C^n$ is asymptotically with high probability (w.h.p.) $(1+o(1))\,2\sqrt{\log n/k}$ when $n$ grows and $k = O(\log n/\log\log n)$ (a more refined distributional result is obtained).…”

Section: Introduction (mentioning)

confidence: 99%
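The order of magnitude of the global optimum quoted above can be recovered from a first-moment count. The following is only a heuristic sketch, not the argument of [BDN12]; the symbols $N_t$ and $t$ are introduced here for illustration, and lower-order terms are dropped.

```latex
% N_t = number of k x k submatrices whose average entry is at least t.
% The average of k^2 i.i.d. N(0,1) entries is N(0, 1/k^2), so for large k t
\[
  \mathbb{E}[N_t]
  = \binom{n}{k}^{2}\,\Pr\!\bigl(\mathcal{N}(0, 1/k^{2}) \ge t\bigr)
  \approx \exp\!\Bigl(2k\log n - \tfrac{1}{2}k^{2}t^{2}\Bigr).
\]
% The exponent changes sign at the critical level
\[
  2k\log n = \tfrac{1}{2}k^{2}t^{2}
  \quad\Longleftrightarrow\quad
  t = 2\sqrt{\frac{\log n}{k}} .
\]
```

Above this level the expected count tends to zero, so the calculation only gives an upper bound; showing that the level is actually attained requires a separate argument.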

“…The proof of this result is fairly involved and proceeds by a careful conditioning argument. In particular, we show that for fixed $r$, conditioned on the event that LAS succeeded in iterating at least $r$ steps, the probability distribution of the "new best matrix" which will be used in constructing the matrix for the next iteration is very close to the largest matrix in the $k \times n$ strip of $C^n$, and which is known to have asymptotic average value of $\sqrt{2\log n/k}$ due to result in [BDN12]. Then we show that the matrix produced in step $r$ and the best matrix in the $k \times n$ strip among the unseen entries are asymptotically independent.…”

Section: Introduction (mentioning)

confidence: 99%
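The value $\sqrt{2\log n/k}$ for the best $k \times k$ submatrix of a $k \times n$ strip follows from a short extreme-value heuristic. The sketch below assumes $k$ fixed and $n \to \infty$; the notation $I$ and $S_j$ is introduced here for illustration only.

```latex
% Fix a row set I with |I| = k. The column sums over I are i.i.d.:
\[
  S_j = \sum_{i \in I} C^{n}_{ij} \sim \mathcal{N}(0, k), \qquad j = 1, \dots, n .
\]
% Each of the k largest among n such sums is (1+o(1)) \sqrt{2k \log n},
% so the submatrix formed by the k best columns has average entry
\[
  \frac{1}{k^{2}} \cdot k \cdot (1+o(1))\sqrt{2k\log n}
  = (1+o(1))\sqrt{\frac{2\log n}{k}} .
\]
% Ratio of the global optimum to this value:
\[
  \frac{2\sqrt{\log n/k}}{\sqrt{2\log n/k}} = \sqrt{2} .
\]
```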

“…as $n$ grows, thus factor $\sqrt{2}$ smaller than the global optimum. Motivated by this finding, the authors suggest that the outcome of the LAS algorithm should be also factor $\sqrt{2}$ smaller than the global optimum, however one cannot deduce this from the result of [BDN12] since it is not ruled out that LAS is clever enough to find a "rare" locally maximum matrix with a significantly larger average value than $\sqrt{2\log n/k}$.…”

Section: Introduction (mentioning)

confidence: 99%


Finding a large submatrix of a Gaussian random matrix

Gamarnik, Li

2018

Ann. Statist.


We consider the problem of finding a $k \times k$ submatrix of an $n \times n$ matrix with i.i.d. standard Gaussian entries, which has a large average entry. It was shown in [BDN12] using non-constructive methods that the largest average value of a $k \times k$ submatrix is $2(1+o(1))\sqrt{\log n/k}$ with high probability (w.h.p.) when $k = O(\log n/\log\log n)$. In the same paper an evidence was provided that a natural greedy algorithm called Largest Average Submatrix (LAS) for a constant $k$ should produce a matrix with average entry at most $(1+o(1))\sqrt{2\log n/k}$, namely approximately $\sqrt{2}$ smaller, though no formal proof of this fact was provided. In this paper we show that the matrix produced by the LAS algorithm is indeed $(1+o(1))\sqrt{2\log n/k}$ w.h.p. when $k$ is constant and $n$ grows. Then by drawing an analogy with the problem of finding cliques in random graphs, we propose a simple greedy algorithm which produces a $k \times k$ matrix with asymptotically the same average value $(1+o(1))\sqrt{2\log n/k}$ w.h.p., for $k = o(\log n)$. Since the greedy algorithm is the best known algorithm for finding cliques in random graphs, it is tempting to believe that beating the factor $\sqrt{2}$ performance gap suffered by both algorithms might be very challenging. Surprisingly, we show the existence of a very simple algorithm which produces a $k \times k$ matrix with average value $(1+o_k(1))(4/3)\sqrt{2\log n/k}$ for in fact $k = o(n)$. To get an insight into the algorithmic hardness of this problem, and motivated by methods originating in the theory of spin glasses, we conduct the so-called expected overlap analysis of matrices with average value asymptotically $(1+o(1))\alpha\sqrt{2\log n/k}$ for a fixed value $\alpha \in [1, \sqrt{2}]$. The overlap corresponds to the number of common rows and common columns for pairs of matrices achieving this value (see the paper for details). We discover numerically an intriguing phase transition at $\alpha^* = 5\sqrt{2}/(3\sqrt{3}) \approx 1.3608 \in [4/3, \sqrt{2}]$: when $\alpha < \alpha^*$ the space of overlaps is a continuous subset of $[0,1]^2$, whereas $\alpha = \alpha^*$ marks the onset of discontinuity, and as a result the model exhibits the Overlap Gap Property (OGP) when $\alpha > \alpha^*$, appropriately defined. We conjecture that OGP observed for $\alpha > \alpha^*$ also marks the onset of the algorithmic hardness: no polynomial time algorithm exists for finding matrices with average value at least $(1+o(1))\alpha\sqrt{2\log n/k}$, when $\alpha > \alpha^*$ and $k$ is a growing function of $n$.
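The constants in this abstract can be lined up on one scale, in units of $\sqrt{2\log n/k}$. The snippet below is a minimal sketch that evaluates the leading-order values only (all $(1+o(1))$ factors are ignored); the function name `leading_order_levels` and the dictionary keys are hypothetical labels chosen for illustration.

```python
import math

# Phase-transition constant quoted in the abstract: 5*sqrt(2) / (3*sqrt(3)).
ALPHA_STAR = 5 * math.sqrt(2) / (3 * math.sqrt(3))


def leading_order_levels(n: int, k: int) -> dict:
    """Leading-order average values from the abstract, ignoring (1 + o(1)) factors."""
    base = math.sqrt(2 * math.log(n) / k)  # value reached by LAS and the greedy algorithm
    return {
        "las_and_greedy": base,
        "four_thirds_algorithm": (4 / 3) * base,
        "conjectured_hardness_onset": ALPHA_STAR * base,
        "global_optimum": math.sqrt(2) * base,  # equals 2 * sqrt(log n / k)
    }


levels = leading_order_levels(n=10**6, k=5)
for name, value in levels.items():
    print(f"{name}: {value:.4f}")
```

The ordering 1 < 4/3 < α* < √2 is what makes the conjecture non-trivial: the best algorithm in the abstract stops just short of the level where the overlap gap appears.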




“…The first one, called Large Average Submatrix (LAS), has been introduced in [31] and analyzed in Ref. [3], and consists in consecutive updates of k rows and k columns, starting from a random k × k submatrix and repeating the updates until a guaranteed convergence to a local maximum, meaning that the resulting submatrix cannot be improved by changing only its column or row set. A recently introduced improved version of this algorithm, analysed in [16] and named Iterative Greedy Procedure (IGP), follows a simple greedy scheme: starting from one randomly chosen row, we add the best columns and rows sequentially until a k × k submatrix is recovered.…”

Section: Localization Via Biclustering Methods (mentioning)

confidence: 99%
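The alternating row/column update described in this statement can be sketched in a few lines of NumPy. This is an illustrative reading of the description, not the cited authors' implementation: the function name `las_local_max`, the `max_sweeps` safety cap, and the choice to start from a random column set (the initial rows are overwritten by the first update anyway) are assumptions made here.

```python
import numpy as np


def las_local_max(matrix, k, rng, max_sweeps=1000):
    """Alternate best-rows / best-columns updates until neither set changes."""
    n_cols = matrix.shape[1]
    # Assumed initialisation: k columns chosen uniformly at random.
    cols = np.sort(rng.choice(n_cols, size=k, replace=False))
    for _ in range(max_sweeps):
        # Best k rows for the current columns: largest row sums over those columns.
        rows = np.sort(np.argsort(matrix[:, cols].sum(axis=1))[-k:])
        # Best k columns for the updated rows: largest column sums over those rows.
        new_cols = np.sort(np.argsort(matrix[rows, :].sum(axis=0))[-k:])
        if np.array_equal(new_cols, cols):
            break  # rows are optimal for cols and cols for rows: a local maximum
        cols = new_cols
    return rows, cols


rng = np.random.default_rng(0)
n, k = 500, 4
gaussian = rng.standard_normal((n, n))
rows, cols = las_local_max(gaussian, k, rng)
average = gaussian[np.ix_(rows, cols)].mean()
print("average entry:", average, "sqrt(2 log n / k):", np.sqrt(2 * np.log(n) / k))
```

Each update can only increase the sum of the selected entries, which is why the iteration terminates; the cap on sweeps is only a guard. The printed comparison is asymptotic in nature, so at small n the two numbers need not be close.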

Detection of Cyber-Physical Faults and Intrusions from Physical Correlations

Lokhov, Lemons, McAndrew, et al.

2016

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)


Cyber-physical systems are critical infrastructures that are crucial both to the reliable delivery of resources such as energy, and to the stable functioning of automatic and control architectures. These systems are composed of interdependent physical, control and communications networks described by disparate mathematical models creating scientific challenges that go well beyond the modeling and analysis of the individual networks. A key challenge in cyber-physical defense is a fast online detection and localization of faults and intrusions without prior knowledge of the failure type. We describe a set of techniques for the efficient identification of faults from correlations in physical signals, assuming only a minimal amount of available system information. The performance of our detection method is illustrated on data collected from a large building automation system.


On the energy landscape of the mixed even p-spin model

Chen, Handschy, Lerman

2017

Probab. Theory Relat. Fields


We investigate the energy landscape of the mixed even p-spin model with Ising spin configurations. We show that for any given energy level between zero and the maximal energy, with overwhelming probability there exist exponentially many distinct spin configurations such that their energies stay near this energy level. Furthermore, their magnetizations and overlaps are concentrated around some fixed constants. In particular, at the level of maximal energy, we prove that the Hamiltonian exhibits exponentially many orthogonal peaks. This improves the results of Chatterjee [20] and Ding-Eldan-Zhai [29], where the former established a logarithmic size of the number of the orthogonal peaks, while the latter proved a polynomial size. Our second main result obtains disorder chaos at zero temperature and at any external field. As a byproduct, this implies that the fluctuation of the maximal energy is superconcentrated when the external field vanishes and obeys a Gaussian limit law when the external field is present.

