This paper considers general rank-constrained optimization problems that minimize a general objective function f(X) over the set of rectangular n × m matrices of rank at most r. To tackle the rank constraint and also to reduce the computational burden, we factorize X into UV^T, where U and V are n × r and m × r matrices, respectively, and then optimize over the small factor matrices U and V. We characterize the global optimization geometry of the nonconvex factored problem and show that the corresponding objective function satisfies the robust strict saddle property as long as the original objective function f satisfies restricted strong convexity and smoothness properties, ensuring global convergence of many local search algorithms (such as noisy gradient descent) in polynomial time for solving the factored problem. We also provide a comprehensive analysis of the optimization geometry of a matrix factorization problem where we aim to find n × r and m × r matrices U and V such that UV^T approximates a given matrix X. Aside from the robust strict saddle property, we show that the objective function of the matrix factorization problem has no spurious local minima and obeys the strict saddle property not only for the exact-parameterization case where rank(X) = r, but also for the over-parameterization case where rank(X) < r and the under-parameterization case where rank(X) > r. These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) converge to a global solution with random initialization.
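As a concrete illustration (not from the paper itself), the exact-parameterization case rank(X) = r of the matrix factorization problem can be sketched with plain gradient descent on the factored objective (1/2)‖UV^T − X‖_F² in NumPy. The problem sizes, step size, initialization scale, and iteration count below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 20, 15, 2

# Ground-truth matrix with rank(X) = r (exact parameterization).
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))

# Small random initialization of the factors U (n x r) and V (m x r).
U = 0.1 * rng.standard_normal((n, r))
V = 0.1 * rng.standard_normal((m, r))

step = 0.3 / np.linalg.norm(X, 2)  # conservative step size
for _ in range(10000):
    R = U @ V.T - X   # residual U V^T - X
    gU = R @ V        # gradient of 0.5*||U V^T - X||_F^2 w.r.t. U
    gV = R.T @ U      # gradient w.r.t. V
    U -= step * gU
    V -= step * gV

rel_err = np.linalg.norm(U @ V.T - X) / np.linalg.norm(X)
print(rel_err)
```

Consistent with the benign geometry described above, the iterates started from a random initialization drive the relative error toward zero in this instance.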
This work considers two popular minimization problems: (i) the minimization of a general convex function f(X) with the domain being positive semi-definite matrices; (ii) the minimization of a general convex function f(X) regularized by the matrix nuclear norm ||X||_* with the domain being general matrices. Despite their optimal statistical performance in the literature, these two optimization problems have a high computational complexity even when solved using tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer and Monteiro to factor the low-rank variable X = UU^T (for semi-definite matrices) or X = UV^T (for general matrices) and also replace the nuclear norm ||X||_* with (||U||_F^2 + ||V||_F^2)/2. In spite of the non-convexity of the resulting factored formulations, we prove that each critical point either corresponds to the global optimum of the original convex problems or is a strict saddle where the Hessian matrix has a strictly negative eigenvalue. Such a nice geometric structure of the factored formulations allows many local search algorithms to find a global optimizer even with random initializations.
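A minimal NumPy sketch of the Burer-Monteiro idea for case (ii), under the assumption that f is the simple denoising loss f(X) = (1/2)‖X − Y‖_F² (chosen here only because the convex problem then has a closed-form solution, singular-value soft-thresholding, to compare against). The dimensions, regularization weight lam, and step size are arbitrary choices, and the factor rank r is taken at least as large as the rank of the convex optimum:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, r = 15, 12, 2
lam = 1.0  # nuclear-norm regularization weight (arbitrary)

# Noisy low-rank observation; take f(X) = 0.5 * ||X - Y||_F^2.
Y = (rng.standard_normal((n, r)) @ rng.standard_normal((r, m))
     + 0.1 * rng.standard_normal((n, m)))

# Closed-form solution of the convex problem min_X f(X) + lam*||X||_*:
# soft-threshold the singular values of Y.
Uy, s, Vty = np.linalg.svd(Y, full_matrices=False)
X_star = Uy @ np.diag(np.maximum(s - lam, 0.0)) @ Vty

# Factored formulation: min_{U,V} f(U V^T) + lam*(||U||_F^2 + ||V||_F^2)/2,
# solved by plain gradient descent from a small random initialization.
U = 0.1 * rng.standard_normal((n, r))
V = 0.1 * rng.standard_normal((m, r))
step = 0.2 / np.linalg.norm(Y, 2)
for _ in range(8000):
    R = U @ V.T - Y
    gU = R @ V + lam * U   # gradient w.r.t. U, including the regularizer
    gV = R.T @ U + lam * V
    U -= step * gU
    V -= step * gV

# Compare the factored iterate against the convex optimum X_star.
gap = np.linalg.norm(U @ V.T - X_star) / np.linalg.norm(X_star)
print(gap)
```

In this instance the factored iterate matches the convex optimum, reflecting the result above that every critical point of the factored formulation is either globally optimal or a strict saddle.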