We propose an efficient distributed randomized coordinate descent method for minimizing regularized non-strongly convex loss functions. The method attains the optimal O(1/k 2 ) convergence rate, where k is the iteration counter. The core of the work is the theoretical study of stepsize parameters. We have implemented the method on Archer-the largest supercomputer in the UK-and show that the method is capable of solving a (synthetic) LASSO optimization problem with 50 billion variables.
The design and complexity analysis of randomized coordinate descent methods, and in particular of variants which update a random subset (sampling) of coordinates in each iteration, depends on the notion of expected separable overapproximation (ESO). This refers to an inequality involving the objective function and the sampling, capturing in a compact way certain smoothness properties of the function in a random subspace spanned by the sampled coordinates. ESO inequalities were previously established for special classes of samplings only, almost invariably for uniform samplings. In this paper we develop a systematic technique for deriving these inequalities for a large class of functions and for arbitrary samplings. We demonstrate that one can recover existing ESO results using our general approach, which is based on the study of eigenvalues associated with samplings and the data describing the function.
We study the problem of minimizing the sum of a smooth convex function and a convex blockseparable regularizer and propose a new randomized coordinate descent method, which we call ALPHA. Our method at every iteration updates a random subset of coordinates, following an arbitrary distribution. No coordinate descent methods capable to handle an arbitrary sampling have been studied in the literature before for this problem. ALPHA is a remarkably flexible algorithm: in special cases, it reduces to deterministic and randomized methods such as gradient descent, coordinate descent, parallel coordinate descent and distributed coordinate descentboth in nonaccelerated and accelerated variants. The variants with arbitrary (or importance) sampling are new. We provide a complexity analysis of ALPHA, from which we deduce as a direct corollary complexity bounds for its many variants, all matching or improving best known bounds.
Max-plus based methods have been recently developed to approximate the value function of possibly high dimensional optimal control problems. A critical step of these methods consists in approximating a function by a supremum of a small number of functions (max-plus "basis functions") taken from a prescribed dictionary. We study several variants of this approximation problem, which we show to be continuous versions of the facility location and k-center combinatorial optimization problems, in which the connection costs arise from a Bregman distance. We give theoretical error estimates, quantifying the number of basis functions needed to reach a prescribed accuracy. We derive from our approach a refinement of the curse of dimensionality free method introduced previously by McEneaney, with a higher accuracy for a comparable computational cost.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.