2015
DOI: 10.48550/arxiv.1503.02101
Preprint

Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition

Abstract: We analyze stochastic gradient descent for optimizing non-convex functions. In many cases for non-convex functions the goal is to find a reasonable local minimum, and the main concern is that gradient updates are trapped in saddle points. In this paper we identify a strict saddle property for non-convex problems that allows for efficient optimization. Using this property we show that stochastic gradient descent converges to a local minimum in a polynomial number of iterations. To the best of our knowledge this is …
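
To make the mechanism in the abstract concrete, here is a minimal sketch (not the paper's algorithm or step-size schedule) of noisy gradient descent on a toy strict-saddle function: f(x, y) = x^4/4 - x^2/2 + y^2/2 has a strict saddle at the origin (Hessian eigenvalues -1 and +1) and local minima at (±1, 0). The function, step size, noise level, and iteration count are all illustrative assumptions.

```python
# Minimal sketch: noise lets gradient descent leave a strict saddle point.
# f(x, y) = x**4/4 - x**2/2 + y**2/2 has a strict saddle at (0, 0)
# and local minima at (+1, 0) and (-1, 0).
import numpy as np

def grad(w):
    x, y = w
    return np.array([x**3 - x, y])  # gradient of the toy function

rng = np.random.default_rng(0)
w = np.zeros(2)           # start exactly at the saddle, where the gradient is zero
eta, sigma = 0.05, 0.1    # step size and noise scale (illustrative choices)

for _ in range(2000):
    noise = sigma * rng.standard_normal(2)   # isotropic gradient noise
    w = w - eta * (grad(w) + noise)

print(w)  # ends near (+1, 0) or (-1, 0), i.e. one of the local minima
```

Without the noise term the iterate would stay at the saddle forever, since the gradient there is exactly zero; the added perturbation is what pushes it into the negative-curvature direction.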

Cited by 173 publications (48 citation statements); references 20 publications. Citing publications span 2016–2024.

Selected citation statements (ordered by relevance):

“…Combined with the results in [GHJY15, LSJR16] (see Theorem 2.3), we have Theorem 1.2 (Informal): with high probability, stochastic gradient descent on the regularized objective (1.2) will converge to a solution X such that XX^T = ZZ^T = M in polynomial time from any starting point.…”
Section: Results (supporting)
confidence: 65%
“…Our characterization of the structure in the objective function implies that (stochastic) gradient descent from an arbitrary starting point converges to a global minimum. This is because gradient descent converges to a local minimum [GHJY15, LSJR16], and every local minimum is also a global minimum.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this regime, the batch gradient method often fails with random initialization. As is widely believed, stochastic algorithms are efficient at escaping bad local minima or saddle points in non-convex optimization because of their inherent noise [26, 60]. We observe numerically that IRWF and block IRWF from a random starting point still converge to the global minimum even with a very small sample size, close to the theoretical limit [57].…”
Section: Discussion (supporting)
confidence: 68%
“…Furthermore, our study of the local landscape of Fourier expansions reveals that the existence of saddle points is an obstacle to designing algorithms with theoretical guarantees. Previous research shows that gradient-based algorithms are particularly susceptible to saddle points [36]. Although the studies in [37, 38] indicate that stochastic gradient descent with random noise is enough to escape saddle points, they assume the strict saddle property, which does not hold in our case.…”
Section: Introduction (mentioning)
confidence: 83%
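
Several of the statements above concern gradient methods on low-rank matrix problems where every local minimum is global, so converging to a local minimum already solves the problem. As a rough illustration only (the regularizer of objective (1.2), the stochastic noise, and the step-size conditions of the cited theorems are omitted, and the matrix sizes, step size, and iteration count are assumptions), the sketch below runs plain gradient descent on f(X) = ||XX^T - M||_F^2 / 4 for a small synthetic M = ZZ^T, starting from a random point.

```python
# Rough sketch: gradient descent on f(X) = ||X X^T - M||_F^2 / 4 with M = Z Z^T.
# This is not the regularized objective (1.2) from the citing papers; it only
# illustrates recovery of M from a random start on a tiny synthetic instance.
import numpy as np

rng = np.random.default_rng(1)
n, r = 20, 3
Z = rng.standard_normal((n, r))
M = Z @ Z.T                               # ground-truth low-rank matrix

X = 0.1 * rng.standard_normal((n, r))     # random initialization
eta = 0.25 / np.linalg.norm(M, 2)         # illustrative step size

for _ in range(5000):
    G = (X @ X.T - M) @ X                 # gradient of ||X X^T - M||_F^2 / 4
    X = X - eta * G

print(np.linalg.norm(X @ X.T - M) / np.linalg.norm(M))  # relative error, close to 0
```

Because X is only identified up to an orthogonal transformation, the check compares XX^T with M rather than X with Z.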