We develop several efficient algorithms for the classical Matrix Scaling problem, which is used in many diverse areas, from preconditioning linear systems to approximation of the permanent. On an input n × n matrix A, this problem asks to find diagonal (scaling) matrices X and Y (if they exist), so that XAY ε-approximates a doubly stochastic matrix, or more generally a matrix with prescribed row and column sums.

We address the general scaling problem as well as some important special cases. In particular, if A has m nonzero entries, and if there exist X and Y with polynomially large entries such that XAY is doubly stochastic, then we can solve the problem in total complexity O(m + n^{4/3}). This greatly improves on the best known previous results, which were either O(n^4) or O(m n^{1/2} / ε).

Our algorithms are based on tailor-made first- and second-order techniques, combined with other recent advances in continuous optimization, which may be of independent interest for solving similar problems.

arXiv:1704.02315v1 [cs.DS] 7 Apr 2017

We say that A is asymptotically (r, c)-scalable if the row and column sums can reach r and c asymptotically: that is, for every ε > 0, there exist positive diagonal matrices X, Y such that, letting B = XAY, we have ||B1 − r|| ≤ ε and ||1^T B − c|| ≤ ε. The combinatorial essence of asymptotic scaling follows from a well-known characterization (see Proposition 2.2): a matrix A is asymptotically (1, 1)-scalable if and only if the permanent of A is positive, namely if and only if the bipartite graph defined by the positive entries of A has a perfect matching; and A is asymptotically (r, c)-scalable if and only if a natural flow problem on the same bipartite graph has a feasible solution.
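The classical iterative approach to (r, c)-scaling is Sinkhorn's alternate row/column normalization. It is not one of the fast algorithms discussed here, but a minimal sketch of it (in Python; the function name is ours, for illustration) makes concrete what "finding X and Y" means:

```python
import numpy as np

def sinkhorn_scale(A, r, c, eps=1e-8, max_iters=10_000):
    """Alternately fix the row and column sums of a nonnegative matrix A.
    Returns diagonal scalings x, y and B = diag(x) @ A @ diag(y) with row
    sums ~ r and column sums ~ c.  Illustrative only: this simple scheme
    converges far more slowly than the algorithms discussed in the text."""
    x = np.ones(A.shape[0])
    y = np.ones(A.shape[1])
    for _ in range(max_iters):
        x = r / (A @ y)          # make row sums of diag(x) A diag(y) equal r
        y = c / (x @ A)          # then make column sums equal c
        B = x[:, None] * A * y
        if np.abs(B.sum(axis=1) - r).sum() <= eps:
            break                # column sums are exact after the y-update
    return x, y, B

# Example: scale a positive 2x2 matrix to (approximately) doubly stochastic.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
x, y, B = sinkhorn_scale(A, np.ones(2), np.ones(2))
```

For strictly positive matrices this iteration is guaranteed to converge; for matrices with zeros, the combinatorial characterization above governs whether exact or only asymptotic scalings exist.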
Duality (Hall's theorem and the max-flow min-cut theorem) gives simple certificates of non-scalability in terms of the patterns of 0's in the matrix A.

The main computational problem we study is: given a matrix A, vectors r, c and ε > 0, determine if A is ε-approximately (r, c)-scalable, and if so, find the scaling matrices X, Y.

Before diving into the history of matrix scaling, we explain one of its most basic applications, which also demonstrates its algorithmic importance.

Preconditioning Linear Systems. When solving a linear system Az = b, it is often desirable, for numerical stability and efficiency purposes, to have the matrix A be well-conditioned. When this is not the case, one tries to transform A into a "better conditioned" matrix A′. Matrix scaling provides a natural and efficient reduction for doing so. For instance, one would hope that a scaled matrix A′, in which all row and column p-norms are (say) 1, is better conditioned. For this reason, we can use a matrix scaling algorithm to obtain diagonal matrices X, Y, and define A′ = XAY. Now, the solution to Az = b can be obtained by solving the (hopefully more numerically stable) linear system A′z′ = Xb and setting z = Y z′. We stress here that A and A′ have the same sparsity.

History and Prior Work. The matrix (r, c)-scaling problem is so natural and important that it was discove...
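The preconditioning reduction can be sketched end to end. The following toy example (our own illustration; a crude alternate-normalization loop stands in for the faster scaling algorithms of this work) solves Az = b through the scaled system A′z′ = Xb:

```python
import numpy as np

# Solve Az = b by first rescaling A' = X A Y.  The scaling here is a crude
# alternate-normalization loop; the paper's algorithms compute X, Y faster.
rng = np.random.default_rng(0)
A = rng.uniform(0.1, 10.0, size=(4, 4))   # a positive, badly scaled matrix
b = rng.normal(size=4)

x = np.ones(4)
y = np.ones(4)
for _ in range(200):
    x = 1.0 / (A @ y)                     # push row sums of X A Y toward 1
    y = 1.0 / (x @ A)                     # push column sums toward 1
A2 = x[:, None] * A * y                   # A' = X A Y, same sparsity as A

z2 = np.linalg.solve(A2, x * b)           # solve A'z' = Xb
z = y * z2                                # recover z = Y z'
```

Unrolling the algebra: A′z′ = Xb means XAYz′ = Xb, hence A(Yz′) = b, so z = Yz′ solves the original system.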
In this paper we present a deterministic polynomial-time algorithm for testing whether a symbolic matrix in non-commuting variables over Q is invertible or not. The analogous question for commuting variables is the celebrated polynomial identity testing (PIT) problem for symbolic determinants. In contrast to the commutative case, which has an efficient probabilistic algorithm, the best previous algorithm for the non-commutative setting required exponential time [IQS17] (whether or not randomization is allowed). The algorithm efficiently solves the "word problem" for the free skew field, and the identity testing problem for arithmetic formulae with division over non-commuting variables, two problems which had only exponential-time algorithms prior to this work.

The main contribution of this paper is a complexity analysis of an existing algorithm due to Gurvits [Gur04], who proved it was polynomial time for certain classes of inputs. We prove it always runs in polynomial time. The main component of our analysis is a simple (given the necessary known tools) lower bound on the central notion of capacity of operators (introduced by Gurvits [Gur04]). We extend the algorithm to approximate capacity to any accuracy in polynomial time, and use this analysis to give quantitative bounds on the continuity of capacity (the latter is used in a subsequent paper on Brascamp-Lieb inequalities). We also extend the algorithm to compute not only singularity but the (non-commutative) rank of a symbolic matrix, yielding a factor-2 approximation of the commutative rank. This naturally raises a relaxation of the commutative PIT problem: achieving a better deterministic approximation of the commutative rank.

Symbolic matrices in non-commuting variables, and the related structural and algorithmic questions, have a remarkable number of diverse origins and motivations.
They arise independently in (commutative) invariant theory and representation theory, linear algebra, optimization, linear system theory, quantum information theory, approximation of the permanent, and naturally in non-commutative algebra. We provide a detailed account of some of these sources and their interconnections. In particular, we explain how some of these sources played an important role in the development of Gurvits' algorithm and in our analysis of it here.

* Microsoft Research New England, email: garga@microsoft.com.
The celebrated Brascamp-Lieb (BL) inequalities [BL76, Lie90], and their reverse form due to Barthe [Bar98], are an important mathematical tool, unifying and generalizing numerous inequalities in analysis, convex geometry and information theory, with many used in computer science. While their structural theory is very well understood, far less is known about computing their main parameters (which we define below). Prior to this work, the best known algorithms for any of these optimization tasks required at least exponential time. In this work, we give polynomial-time algorithms to compute: (1) feasibility of a BL-datum, (2) the optimal BL-constant, (3) a weak separation oracle for BL-polytopes.

What is particularly exciting about this progress, beyond the better understanding of BL-inequalities, is that the objects above naturally encode rich families of optimization problems which had no prior efficient algorithms. In particular, the BL-constants (which we efficiently compute) are solutions to non-convex optimization problems, and the BL-polytopes (for which we provide efficient membership and separation oracles) are linear programs with exponentially many facets. Thus we hope that new combinatorial optimization problems can be solved via reductions to the ones above, and we make modest initial steps in exploring this possibility.

Our algorithms are obtained by a simple efficient reduction of a given BL-datum to an instance of the Operator Scaling problem defined by [Gur04]. To obtain the results above, we utilize the two (very recent and different) algorithms for the operator scaling problem [GGOW16, IQS15]. Our reduction implies algorithmic versions of many of the known structural results on BL-inequalities, and in some cases provides proofs that are different from or simpler than existing ones.
Further, the analytic properties of the [GGOW16] algorithm provide new, effective bounds on the magnitude and continuity of BL-constants; prior work relied on compactness arguments, and thus provided no such bounds.

On a higher level, our application of the operator scaling algorithm to BL-inequalities further connects analysis and optimization with the diverse mathematical areas used so far to motivate and solve the operator scaling problem, which include commutative invariant theory, non-commutative algebra, computational complexity and quantum information theory.
We propose a new second-order method for geodesically convex optimization on the natural hyperbolic metric over positive definite matrices. We apply it to solve the operator scaling problem in time polynomial in the input size and logarithmic in the error. This is an exponential improvement over previous algorithms, which were analyzed in the usual Euclidean, commutative metric (in which the above problem is not convex). Our method is general and applicable to other settings.

As a consequence, we solve the equivalence problem for the left-right group action underlying the operator scaling problem. This yields a deterministic polynomial-time algorithm for a new class of Polynomial Identity Testing (PIT) problems, which was the original motivation for studying operator scaling.

* Microsoft Research Redmond.

... the downstream applications (such as orbit-closure intersection).

Remark 1.1. A special case of operator scaling is the matrix scaling problem (cf. [7, 19] and references therein). In matrix scaling, we are given a real matrix with non-negative entries, and asked to re-scale its rows and columns to make it doubly stochastic. In this very special case, one can make a change of variables in the appropriate capacity and make it convex in the Euclidean metric. This affords standard convex optimization techniques, and for this special case, algorithms running in time poly(n, log M, log 1/ε) are known [7, 19, 52, 60].

It is known that for every positive operator T, log(det(T(X))) is geodesically convex in X [74]. Also, it is simple to verify that log(det(X)) is geodesically linear (i.e., both convex and concave). Hence, if we define the following alternative objective (removing the hard constraint on det(X)):

    logcap(X) = log det(T(X)) − log det(X)    (1.1)

Footnote 1: It is also known as a completely positive operator.
Footnote 2: One can assume the Ai's are integral or complex integral without loss of generality.
Footnote 3: This should be contrasted with the fact that log(det(X)) is a concave function in the Euclidean geometry.
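To make the objective (1.1) concrete, here is a small numerical sketch (our own toy data; the completely positive operator is given in Kraus form, T(X) = Σᵢ AᵢXAᵢᵀ in the real case):

```python
import numpy as np

def T(X, As):
    """Completely positive operator T(X) = sum_i A_i X A_i^T (real case)."""
    return sum(A @ X @ A.T for A in As)

def logcap(X, As):
    """The objective of Eq. (1.1): log det(T(X)) - log det(X).
    Geodesically convex over positive definite X in the hyperbolic metric,
    even though it is not convex in the usual Euclidean sense."""
    sign1, ld1 = np.linalg.slogdet(T(X, As))
    sign2, ld2 = np.linalg.slogdet(X)
    assert sign1 > 0 and sign2 > 0      # both determinants must be positive
    return ld1 - ld2

# Toy instance: two random Kraus matrices (hypothetical illustration data).
rng = np.random.default_rng(1)
As = [rng.normal(size=(3, 3)) for _ in range(2)]
val = logcap(np.eye(3), As)
```

Note that because T is linear in X, the objective is invariant under X ↦ cX for c > 0: the log det terms pick up the same additive n·log c, which cancels; this is why the hard determinant constraint can be dropped.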