We consider a subclass of n-player stochastic games in which players have their own internal state/action spaces but are coupled through their payoff functions. It is assumed that players' internal chains are driven by independent transition probabilities. Moreover, players can only receive realizations of their payoffs, not the payoff functions themselves, nor can they observe each other's states/actions. Under some assumptions on the structure of the payoff functions, we develop efficient learning algorithms based on Dual Averaging and Dual Mirror Descent, which provably converge, almost surely or in expectation, to the set of ε-Nash equilibrium policies. In particular, we derive upper bounds on the number of iterations required to reach an ε-Nash equilibrium policy that scale polynomially in the game parameters. Besides Markov potential games and linear-quadratic stochastic games, this work provides another interesting subclass of n-player stochastic games that provably admits polynomial-time learning algorithms for finding ε-Nash equilibrium policies.
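To illustrate the type of update a Dual Averaging scheme is built on, the following is a minimal sketch of a dual-averaging policy update with an entropic regularizer over a single player's action simplex, driven by noisy payoff feedback. The function names, step-size schedule, and payoff-gradient oracle are illustrative assumptions and do not reproduce the algorithm analyzed in this paper.

```python
import numpy as np

# Hypothetical sketch: dual averaging ("lazy" mirror descent with an entropic
# regularizer) for one player's mixed policy over a finite action set.
# The payoff-gradient oracle stands in for realized payoff feedback.

def softmax(z):
    z = z - z.max()                      # stabilize the exponentials
    w = np.exp(z)
    return w / w.sum()

def dual_averaging_policy(payoff_gradient, n_actions, horizon=1000, eta=0.1):
    """Return the policy after `horizon` dual-averaging rounds.

    payoff_gradient(policy) is assumed to return a (possibly noisy)
    estimate of the player's payoff gradient at the current policy.
    """
    z = np.zeros(n_actions)              # cumulative gradient in the dual space
    policy = np.full(n_actions, 1.0 / n_actions)
    for t in range(1, horizon + 1):
        g = payoff_gradient(policy)      # stochastic payoff feedback
        z += g                           # "lazy" accumulation, no per-step projection
        policy = softmax(eta * z / np.sqrt(t))   # entropic mirror map back to the simplex
    return policy

# Toy usage with a linear payoff u(x) = r . x, so the gradient is r plus noise.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r = np.array([0.2, 0.5, 0.3])
    noisy_grad = lambda x: r + 0.05 * rng.standard_normal(r.size)
    print(dual_averaging_policy(noisy_grad, n_actions=3))
```

With the decreasing effective step size eta/sqrt(t), the iterates concentrate on actions with the largest cumulative payoff estimates, which is the standard no-regret behavior that dual-averaging schemes rely on.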
Index Terms: Stochastic games, stationary Nash equilibrium, dual averaging, dual mirror descent, learning in games.
I. INTRODUCTION

Since the early work on the existence of a mixed-strategy Nash equilibrium in static noncooperative games [1], and its extension to the existence of stationary Nash equilibrium policies in dynamic stochastic games [2], substantial research has been devoted to developing scalable algorithms for computing Nash equilibrium (NE) points in static and dynamic environments. NE provides a stable solution concept for strategic multiagent decision-making systems, which is a desirable property in many applications such as socioeconomic systems [3], network security [4], and routing and scheduling [5], among many others [6], [7].

Unfortunately, computing NE is in general PPAD-hard [8] and is unlikely to admit a polynomial-time algorithm. To overcome this fundamental barrier, two main approaches have been adopted in the literature: i) searching for relaxed notions of stable solutions, such as correlated equilibrium [9], whose set includes the set of NE, and ii) searching for NE points in games with special structure, such as potential games [10] or concave games [11]. Thanks to recent advances in learning theory, it is known that some algorithms tailored to finding relaxed notions of equilibrium in case (i) can also be used to compute NE points of structured games in case (ii). For instance, the so-called no-regret algorithms always converge to the set of coarse correlated equilibria [7], and they can also be used to compute NE in the class of socially concave games [12]. However, such results have mainly been developed for static games, in which players repeatedly play the same game and gradually learn the underlying stationary environment. Unfortunately, extensions of such results to dynamic stochastic games [2], [13], in which the state of the game evolves as a result of players' past decisions and the realizations of a stochastic...