In this paper, we investigate the non-asymptotic stationary convergence behavior of Stochastic Mirror Descent (SMD) for nonconvex optimization. We focus on a general class of nonconvex nonsmooth stochastic optimization problems, in which the objective can be decomposed into a relatively weakly convex function (possibly non-Lipschitz) and a simple non-smooth convex regularizer. We prove that SMD, without the use of mini-batch, is guaranteed to converge to a stationary point in a convergence rate of O(1/ √ t). The efficiency estimate matches with existing results for stochastic subgradient method, but is evaluated under a stronger stationarity measure. Our convergence analysis applies to both the original SMD and its proximal version, as well as the deterministic variants, for solving relatively weakly convex problems.
In the paper, we develop a composite version of Mirror Prox algorithm for solving convexconcave saddle point problems and monotone variational inequalities of special structure, allowing to cover saddle point/variational analogies of what is usually called "composite minimization" (minimizing a sum of an easy-to-handle nonsmooth and a general-type smooth convex functions "as if" there were no nonsmooth component at all). We demonstrate that the composite Mirror Prox inherits the favourable (and unimprovable already in the large-scale bilinear saddle point case) O(1/ ) efficiency estimate of its prototype. We demonstrate that the proposed approach can be successfully applied to Lasso-type problems with several penalizing terms (e.g. acting together 1 and nuclear norm regularization) and to problems of semi-separable structures considered in the alternating directions methods, implying in both cases methods with the O(1/ ) complexity bounds.
In this paper, we introduce a unified framework for analyzing a large family of Q-learning algorithms, based on switching system perspectives and ODE-based stochastic approximation. We show that the nonlinear ODE models associated with these Q-learning algorithms can be formulated as switched linear systems, and analyze their asymptotic stability by leveraging existing switching system theories. Our approach provides the first O.D.E. analysis of the asymptotic convergence of various Q-learning algorithms, including asynchronous Q-learning and averaging Q-learning. We also extend the approach to analyze Q-learning with linear function approximation and derive a new sufficient condition for its convergence.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.