2018
DOI: 10.1609/aaai.v32i1.11586

Decentralised Learning in Systems With Many, Many Strategic Agents

Abstract: Although multi-agent reinforcement learning can tackle systems of strategically interacting entities, it currently fails in scalability and lacks rigorous convergence guarantees. Crucially, learning in multi-agent systems can become intractable due to the explosion in the size of the state-action space as the number of agents increases. In this paper, we propose a method for computing closed-loop optimal policies in multi-agent systems that scales independently of the number of agents. This allows us to show, …
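The abstract's claim that the method scales independently of the number of agents is characteristic of a mean-field treatment, in which each agent interacts with the rest of the population only through its empirical state distribution. A hedged sketch of that idea, in our own notation rather than the paper's:

\[
\mu_t = \frac{1}{N}\sum_{i=1}^{N}\delta_{s_t^i}, \qquad
\pi^{\ast} \in \arg\max_{\pi}\ \mathbb{E}\Big[\sum_{t \ge 0}\gamma^{t}\, r\big(s_t, a_t, \mu_t\big)\Big]
\quad \text{s.t.} \quad \mu_{t+1} = \Phi\big(\mu_t, \pi^{\ast}\big).
\]

As N grows, the joint state-action blow-up described in the abstract is replaced by a fixed-point problem in (\pi^{\ast}, \mu) whose dimension does not depend on N.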

Cited by 21 publications (8 citation statements)
References 10 publications (17 reference statements)
“…We use GridSearch(M) to denote the zeroth-order method, which spreads 𝑀 number of points into the feasible region for incentive parameter 𝜃 , as a competitor to DASAC. Additionally, we adopt Bayesian optimization to improve the efficiency of the zeroth-order method, denoted as BayesOpt, which is also used in [64]. The experimental details of GridSearch(M) and BayesOpt are included in Appendix G.2.…”
Section: Results and Analysis
confidence: 99%
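As an illustration of the comparison described in the statement above, here is a minimal sketch of a zeroth-order GridSearch(M) over a scalar incentive parameter, with Bayesian optimization noted as the sample-efficient alternative; the objective simulate_marl_return and the one-dimensional feasible region are our assumptions, not details from the cited papers.

import numpy as np

# Hypothetical black box: runs the lower-level multi-agent system under
# incentive theta and returns the designer's objective (toy stand-in here).
def simulate_marl_return(theta):
    return -(theta - 0.3) ** 2

# GridSearch(M): spread M points over the feasible region and keep the best.
def grid_search(M, lo=0.0, hi=1.0):
    thetas = np.linspace(lo, hi, M)
    values = [simulate_marl_return(t) for t in thetas]
    return thetas[int(np.argmax(values))]

print(grid_search(M=50))

# BayesOpt replaces the uniform grid with a surrogate model that picks each
# new evaluation point; e.g., with scikit-optimize (which minimizes):
# from skopt import gp_minimize
# res = gp_minimize(lambda v: -simulate_marl_return(v[0]), [(0.0, 1.0)], n_calls=20)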
“…[54] utilize the incentive design in the multi-bandit problem and prove that the proposed algorithm converges to the global optimum at a sub-linear rate for a broad class of games. [64] provide an incentive-design mechanism for an uncooperative multi-agent system and optimize the upper-level incentive objective with Bayesian optimization, a sample-efficient optimization algorithm, instead of the gradient-based methods because the lower-level MARL problem is a black box. [90] propose a decentralized incentive mechanism that allows each individual to directly give rewards to others and learn its own incentive function, respectively.…”
Section: Related Work
confidence: 99%
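The bilevel structure described above, in which the lower-level MARL problem is treated as a black box, can be sketched as follows; the toy best-response game and every name here are illustrative assumptions rather than details from [54], [64], or [90].

import numpy as np

# Toy lower level: two agents repeatedly best-respond in a game whose payoffs
# are shifted by the incentive theta; iteration approximates a Nash
# equilibrium of the incentivised game (fixed point: x0 = x1 = 2 * theta).
def lower_level_equilibrium(theta, iters=200):
    x = np.zeros(2)
    for _ in range(iters):
        x[0] = 0.5 * x[1] + theta
        x[1] = 0.5 * x[0] + theta
    return x

# Upper level: the designer's objective, evaluated only through the black box.
def upper_objective(theta, target=1.0):
    x = lower_level_equilibrium(theta)
    return -np.sum((x - target) ** 2)

# Gradient-free outer loop, as in the passage above (optimum at theta = 0.5).
thetas = np.linspace(0.0, 1.0, 101)
print("best incentive:", max(thetas, key=upper_objective))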
“…The question of inducing a desirable outcome in non-cooperative games can be traced back to the work of Pigou (1920) on welfare economics. Bilevel programming has long been recognized as the standard approach to such inquiries in operations research, and more recently in the ML community (Mguni et al., 2019; Zheng et al., 2020; Liu et al., 2022; Maheshwari et al., 2022). Our algorithms are focused on the applications pertinent to this question.…”
Section: Bilevel Programs With Equilibrium Constraints
confidence: 99%
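Written out, the generic bilevel program with equilibrium constraints that both of these passages refer to takes the following form (our notation, not that of the cited works):

\[
\max_{\theta \in \Theta} \; F\big(\theta, x^{\ast}(\theta)\big)
\quad \text{s.t.} \quad x^{\ast}(\theta) \in \mathrm{NE}\big(G(\theta)\big),
\]

where G(\theta) is the lower-level non-cooperative game induced by the leader's decision \theta and \mathrm{NE}(\cdot) is its set of Nash equilibria: each follower i plays x_i^{\ast} \in \arg\max_{x_i} u_i(x_i, x_{-i}^{\ast}; \theta).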
“…A typical example is a Stackelberg game concerning a leader who aims to induce a desirable outcome in an economic or social system comprised of many self-interested followers, who can be seen as playing a non-cooperative game that converges to a Nash equilibrium (Dafermos, 1973; Requate, 1993; Marcotte and Marquis, 1992; Labbé et al., 1998; Ehtamo et al., 2002). More recently, motivated by such applications, developing efficient algorithms for solving bilevel programs with equilibrium constraints has also emerged as an essential topic in machine learning (Mguni et al., 2019; Zheng et al., 2020; Liu et al., 2022; Maheshwari et al., 2022).…”
Section: Introduction
confidence: 99%