This paper considers the decentralized optimization problem, which has applications in large-scale machine learning, sensor networks, and control theory. We propose a novel algorithm that achieves near-optimal communication complexity, matching the known lower bound up to a logarithmic factor of the condition number of the problem. Our theoretical results answer affirmatively the open question of whether there exists an algorithm whose communication complexity (nearly) matches the lower bound depending on the global condition number instead of the local one. Moreover, the proposed algorithm achieves the optimal computation complexity, matching the lower bound up to universal constants. Furthermore, to achieve a linear convergence rate, our algorithm does not require the individual functions to be (strongly) convex. Our method relies on a novel combination of known techniques, including Nesterov's accelerated gradient descent, multi-consensus, and gradient tracking. The analysis is new and may be applied to other related problems. Empirical studies demonstrate the effectiveness of our method for machine learning applications.
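As a rough illustration of two of the ingredients named above, the following minimal sketch combines gradient tracking with multi-consensus on a toy problem. It is not the paper's algorithm: Nesterov acceleration is omitted, and the scalar quadratic local functions f_i(x) = 0.5 * a_i * (x - b_i)^2, the three-agent path graph with mixing matrix `W`, the step size, and the number of consensus rounds are all assumptions chosen for illustration. One local function is deliberately non-convex (a_i < 0), while the average of the local functions remains strongly convex.

```python
def mix(W, v, rounds):
    """Multi-consensus: apply `rounds` gossip steps v <- W v."""
    n = len(v)
    for _ in range(rounds):
        v = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
    return v


def gradient_tracking(a, b, W, step, rounds, iters):
    """Plain gradient tracking with multi-consensus (no acceleration).

    Agent i holds f_i(x) = 0.5 * a[i] * (x - b[i])**2 and only talks to
    its neighbours through the doubly stochastic mixing matrix W.
    """
    m = len(a)

    def grad(x):
        # Each agent evaluates its own gradient at its own iterate.
        return [a[i] * (x[i] - b[i]) for i in range(m)]

    x = [0.0] * m
    g = grad(x)
    s = g[:]  # tracker of the average gradient, initialised to local gradients
    for _ in range(iters):
        # Local descent step along the tracked gradient, then consensus.
        x_new = mix(W, [x[i] - step * s[i] for i in range(m)], rounds)
        g_new = grad(x_new)
        # Tracker update: mix, then add the change in local gradients.
        s_mixed = mix(W, s, rounds)
        s = [s_mixed[i] + g_new[i] - g[i] for i in range(m)]
        x, g = x_new, g_new
    return x


# Hypothetical toy data: the third local function is concave (a = -1),
# but the sum of curvatures is positive, so the global problem is well posed.
a = [3.0, 2.0, -1.0]
b = [1.0, -1.0, 2.0]
W = [[0.5, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.5]]

x = gradient_tracking(a, b, W, step=0.1, rounds=3, iters=500)
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)  # global minimizer
print(x, x_star)
```

In this sketch every agent's iterate approaches the minimizer of the averaged objective even though no agent sees the other agents' functions; the repeated mixing in `mix` plays the role of multi-consensus.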
In this paper, we follow the work [17] to study quasi-Newton methods based on the updating formulas from a certain subclass of the Broyden family. We focus on the common SR1 and BFGS quasi-Newton methods to establish better explicit superlinear convergence rates. First, based on the greedy quasi-Newton update in [17], which greedily selects the direction so as to maximize a certain measure of progress, we improve the linear convergence rate to a condition-number-free superlinear convergence rate when applied with the well-known SR1 and BFGS updates. Moreover, our results can also be applied to the inverse approximation of the SR1 update. Second, for the random update, which selects the direction randomly from any spherically symmetric distribution, we show the same superlinear convergence rate as established above. Our analysis is closely related to the approximation of a given Hessian matrix, unconstrained quadratic objectives, and general strongly convex, smooth, and strongly self-concordant functions.
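To make the idea of a greedy update concrete, here is a minimal sketch of a greedy SR1 update that approximates a fixed symmetric positive definite matrix A. It rests on assumptions not stated in the abstract: the candidate directions are the coordinate vectors, the measure of progress is taken to be u^T (G - A) u (so the greedy choice is the largest diagonal entry of G - A), and the initial approximation is G = L * I with L no smaller than the largest eigenvalue of A. The function names and the test matrix are hypothetical.

```python
def greedy_sr1_step(G, A, tol=1e-12):
    """One greedy SR1 update of G toward A, assuming G - A is positive semidefinite."""
    n = len(A)
    R = [[G[i][j] - A[i][j] for j in range(n)] for i in range(n)]
    # Greedy choice among coordinate directions e_i: maximise e_i^T (G - A) e_i.
    i_star = max(range(n), key=lambda i: R[i][i])
    d = R[i_star][i_star]
    if d <= tol:
        # A positive semidefinite residual with no positive diagonal entry
        # is (numerically) zero, so there is nothing left to correct.
        return [row[:] for row in G]
    # With u = e_{i*}: (G - A) u is column i* of R, and u^T (G - A) u = d.
    col = [R[i][i_star] for i in range(n)]
    # SR1 update: G+ = G - (G - A) u u^T (G - A) / (u^T (G - A) u).
    return [[G[i][j] - col[i] * col[j] / d for j in range(n)] for i in range(n)]


def approximate(A, L, steps):
    """Start from G = L * I and apply `steps` greedy SR1 updates."""
    n = len(A)
    G = [[L if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        G = greedy_sr1_step(G, A)
    return G


# Hypothetical 3x3 positive definite matrix; its eigenvalues are at most 5,
# so L = 6 guarantees that the initial residual L * I - A is positive semidefinite.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
G = approximate(A, L=6.0, steps=3)
max_err = max(abs(G[i][j] - A[i][j]) for i in range(3) for j in range(3))
print(max_err)
```

Each SR1 update with a direction of positive residual curvature lowers the rank of G - A by one, so in exact arithmetic this sketch recovers A after at most n updates.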
We study the convergence rate of the famous Symmetric Rank-1 (SR1) algorithm, which has wide applications in different scenarios. Although it has been extensively investigated, SR1 still lacks a non-asymptotic superlinear rate, in contrast to other quasi-Newton methods such as DFP and BFGS. In this paper we address this problem. Inspired by recent work on explicit convergence analysis of quasi-Newton methods, we obtain the first explicit non-asymptotic rates of superlinear convergence for the vanilla SR1 method with a correction strategy that ensures numerical stability. Specifically, the vanilla SR1 method with the correction strategy achieves rates of the form (4n ln(eκ)/k)^{k/2} for general smooth strongly convex functions, where k is the iteration counter, κ is the condition number of the objective function, and n is the dimension of the problem. For quadratic functions, the vanilla SR1 algorithm finds the optimum of the objective function in at most n steps.