In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, it encounters a sequence of (possibly unrelated) convex cost functions. Zinkevich (ICML 2003) introduced this framework, which models many natural repeated decision-making problems and generalizes many existing problems such as Prediction from Expert Advice and Cover's Universal Portfolios. Zinkevich showed that a simple online gradient descent algorithm achieves additive regret O( √ T ), for an arbitrary sequence of T convex cost functions (of bounded gradients), with respect to the best single decision in hindsight.In this paper, we give algorithms that achieve regret O(log(T )) for an arbitrary sequence of strictly convex functions (with bounded first and second derivatives). This mirrors what has been done for the special cases of prediction from expert advice by Kivinen and Warmuth (EuroCOLT 1999), and Universal Portfolios by Cover (Math. Finance 1: [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19] 1991). We propose several algorithms achieving logarithmic regret, which besides being more general are also much more efficient to implement.The main new ideas give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field. Our analysis shows a surprising connection between the natural follow-the-leader approach and the Newton method. We also analyze other algorithms, which tie together several different previous approaches including follow-the-leader, exponential weighting, Cover's algorithm and gradient descent.
Lagrangian relaxation and approximate optimization algorithms have received much attention in the last two decades. Typically, the running time of these methods to obtain a ε approximate solution is proportional to 1 ε 2 . Recently, Bienstock and Iyengar, following Nesterov, gave an algorithm for fractional packing linear programs which runs in 1 ε iterations. The latter algorithm requires to solve a convex quadratic program every iteration -an optimization subroutine which dominates the theoretical running time.We give an algorithm for convex programs with strictly convex constraints which runs in time proportional to 1 ε . The algorithm does not require to solve any quadratic program, but uses gradient steps and elementary operations only. Problems which have strictly convex constraints include maximum entropy frequency estimation, portfolio optimization with loss risk constraints, and various computational problems in signal processing.As a side product, we also obtain a simpler version of Bienstock and Iyengar's result for general linear programming, with similar running time.We derive these algorithms using a new framework for deriving convex optimization algorithms from online game playing algorithms, which may be of independent interest.
We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples. The time complexity of our algorithm to find an approximate local minimum is even faster than that of gradient descent to find a critical point. Our algorithm applies to a general class of optimization problems including training a neural network and other non-convex objectives arising in machine learning.
We study online learning in an oblivious changing environment. The standard measure of regret bounds the difference between the cost of the online learner and the best decision in hindsight. Hence, regret minimizing algorithms tend to converge to the static best optimum, clearly a suboptimal behavior in changing environments. On the other hand, various metrics proposed to strengthen regret and allow for more dynamic algorithms produce inefficient algorithms.We propose a different performance metric which strengthens the standard metric of regret and measures performance with respect to a changing comparator. We then describe a series of datastreaming-based reductions which transform algorithms for minimizing (standard) regret into adaptive algorithms albeit incurring only poly-logarithmic computational overhead.Using this reduction, we obtain efficient low adaptive-regret algorithms for the problem of online convex optimization. This can be applied to various learning scenarios, i.e. online portfolio selection, for which we describe experimental results showing the advantage of adaptivity.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.