2018
DOI: 10.1609/aaai.v32i1.11586

Decentralised Learning in Systems With Many, Many Strategic Agents

Abstract: Although multi-agent reinforcement learning can tackle systems of strategically interacting entities, it currently fails in scalability and lacks rigorous convergence guarantees. Crucially, learning in multi-agent systems can become intractable due to the explosion in the size of the state-action space as the number of agents increases. In this paper, we propose a method for computing closed-loop optimal policies in multi-agent systems that scales independently of the number of agents. This allows us to show, …
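The abstract's claim that the method scales independently of the number of agents is characteristic of a mean-field treatment, in which each agent interacts with the rest of the population only through its empirical state distribution. A hedged sketch of that idea, in our own notation rather than the paper's:

\[
\mu_t = \frac{1}{N}\sum_{i=1}^{N}\delta_{s_t^i}, \qquad
\pi^{\ast} \in \arg\max_{\pi}\ \mathbb{E}\Big[\sum_{t \ge 0}\gamma^{t}\, r\big(s_t, a_t, \mu_t\big)\Big]
\quad \text{s.t.} \quad \mu_{t+1} = \Phi\big(\mu_t, \pi^{\ast}\big).
\]

As N grows, the joint state-action blow-up described in the abstract is replaced by a fixed-point problem in (\pi^{\ast}, \mu) whose dimension does not depend on N.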

Cited by 21 publications (8 citation statements)
References 10 publications (17 reference statements)
“…We use GridSearch(M) to denote the zeroth-order method, which spreads 𝑀 number of points into the feasible region for incentive parameter 𝜃 , as a competitor to DASAC. Additionally, we adopt Bayesian optimization to improve the efficiency of the zeroth-order method, denoted as BayesOpt, which is also used in [64]. The experimental details of GridSearch(M) and BayesOpt are included in Appendix G.2.…”
Section: Results and Analysis
confidence: 99%
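As an illustration of the comparison described in the statement above, here is a minimal sketch of a zeroth-order GridSearch(M) over a scalar incentive parameter, with Bayesian optimization noted as the sample-efficient alternative; the objective simulate_marl_return and the one-dimensional feasible region are our assumptions, not details from the cited papers.

import numpy as np

# Hypothetical black box: runs the lower-level multi-agent system under
# incentive theta and returns the designer's objective (toy stand-in here).
def simulate_marl_return(theta):
    return -(theta - 0.3) ** 2

# GridSearch(M): spread M points over the feasible region and keep the best.
def grid_search(M, lo=0.0, hi=1.0):
    thetas = np.linspace(lo, hi, M)
    values = [simulate_marl_return(t) for t in thetas]
    return thetas[int(np.argmax(values))]

print(grid_search(M=50))

# BayesOpt replaces the uniform grid with a surrogate model that picks each
# new evaluation point; e.g., with scikit-optimize (which minimizes):
# from skopt import gp_minimize
# res = gp_minimize(lambda v: -simulate_marl_return(v[0]), [(0.0, 1.0)], n_calls=20)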
“…[54] utilize the incentive design in the multi-bandit problem and prove that the proposed algorithm converges to the global optimum at a sub-linear rate for a broad class of games. [64] provide an incentive-design mechanism for an uncooperative multi-agent system and optimize the upper-level incentive objective with Bayesian optimization, a sample-efficient optimization algorithm, instead of the gradient-based methods because the lower-level MARL problem is a black box. [90] propose a decentralized incentive mechanism that allows each individual to directly give rewards to others and learn its own incentive function, respectively.…”
Section: Related Work
confidence: 99%
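The bilevel structure described above, in which the lower-level MARL problem is treated as a black box, can be sketched as follows; the toy best-response game and every name here are illustrative assumptions rather than details from [54], [64], or [90].

import numpy as np

# Toy lower level: two agents repeatedly best-respond in a game whose payoffs
# are shifted by the incentive theta; iteration approximates a Nash
# equilibrium of the incentivised game (fixed point: x0 = x1 = 2 * theta).
def lower_level_equilibrium(theta, iters=200):
    x = np.zeros(2)
    for _ in range(iters):
        x[0] = 0.5 * x[1] + theta
        x[1] = 0.5 * x[0] + theta
    return x

# Upper level: the designer's objective, evaluated only through the black box.
def upper_objective(theta, target=1.0):
    x = lower_level_equilibrium(theta)
    return -np.sum((x - target) ** 2)

# Gradient-free outer loop, as in the passage above (optimum at theta = 0.5).
thetas = np.linspace(0.0, 1.0, 101)
print("best incentive:", max(thetas, key=upper_objective))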
“…The question of inducing a desirable outcome in non-cooperative games can be traced back to the work of Pigou (1920) on welfare economics. Bilevel programming has long been recognized as the standard approach to such inquiries in operations research, and more recently in the ML community (Mguni et al., 2019; Zheng et al., 2020; Liu et al., 2022; Maheshwari et al., 2022). Our algorithms are focused on the applications pertinent to this question.…”
Section: Bilevel Programs With Equilibrium Constraints
confidence: 99%
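Written out, the generic bilevel program with equilibrium constraints that both of these passages refer to takes the following form (our notation, not that of the cited works):

\[
\max_{\theta \in \Theta} \; F\big(\theta, x^{\ast}(\theta)\big)
\quad \text{s.t.} \quad x^{\ast}(\theta) \in \mathrm{NE}\big(G(\theta)\big),
\]

where G(\theta) is the lower-level non-cooperative game induced by the leader's decision \theta and \mathrm{NE}(\cdot) is its set of Nash equilibria: each follower i plays x_i^{\ast} \in \arg\max_{x_i} u_i(x_i, x_{-i}^{\ast}; \theta).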
“…A typical example is a Stackelberg game concerning a leader who aims to induce a desirable outcome in an economic or social system comprised of many self-interested followers, who can be seen as playing a non-cooperative game that converges to a Nash equilibrium (Dafermos, 1973; Requate, 1993; Marcotte and Marquis, 1992; Labbé et al., 1998; Ehtamo et al., 2002). More recently, motivated by such applications, developing efficient algorithms for solving bilevel programs with equilibrium constraints has also emerged as an essential topic in machine learning (Mguni et al., 2019; Zheng et al., 2020; Liu et al., 2022; Maheshwari et al., 2022).…”
Section: Introduction
confidence: 99%