2022
DOI: 10.1137/21m1426675

Fictitious Play in Zero-Sum Stochastic Games

Abstract: We present a novel variant of fictitious play dynamics combining classical fictitious play with Q-learning for stochastic games and analyze its convergence properties in two-player zero-sum stochastic games. Our dynamics involves players forming beliefs on the opponent strategy and their own continuation payoff (Q-function), and playing a greedy best response by using the estimated continuation payoffs. Players update their beliefs from observations of opponent actions. A key property of the learning dynamics is…
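The dynamics described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the two-state toy game, the uniform random transitions, the step sizes `alpha` and `beta` (including the choice to update Q-values with a smaller step than the beliefs), and all function and variable names are assumptions made for illustration only. Each player keeps a belief about the opponent's mixed strategy at every state and a Q-estimate, plays a greedy best response against that belief, and then updates both from the observed opponent action.

```python
import math
import random


def expected_values(q_s, belief):
    """Expected continuation payoff of each own action against the believed opponent mix."""
    return [sum(p * q for p, q in zip(belief, row)) for row in q_s]


def run(steps=5000, gamma=0.5, seed=0):
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    # Hypothetical toy game: matching pennies in state 0, its negation in state 1.
    # reward[s][a][b] is player 0's payoff; player 1 receives the negative (zero-sum).
    reward = [[[1.0, -1.0], [-1.0, 1.0]],
              [[-1.0, 1.0], [1.0, -1.0]]]
    # q[i][s][own][opp]: player i's Q-estimate, indexed by own then opponent action.
    q = [[[[0.0] * n_actions for _ in range(n_actions)]
          for _ in range(n_states)] for _ in range(2)]
    # belief[i][s]: player i's belief about the opponent's mixed strategy at state s.
    belief = [[[1.0 / n_actions] * n_actions for _ in range(n_states)]
              for _ in range(2)]
    visits = [0] * n_states
    state = 0
    for _ in range(steps):
        # Greedy best response to the current belief, using the estimated Q-values.
        acts = []
        for i in range(2):
            vals = expected_values(q[i][state], belief[i][state])
            acts.append(max(range(n_actions), key=vals.__getitem__))
        r0 = reward[state][acts[0]][acts[1]]
        next_state = rng.randrange(n_states)  # assumed: uniform transitions
        visits[state] += 1
        n = visits[state]
        alpha = 1.0 / (n + 1)                    # assumed belief step size
        beta = 1.0 / ((n + 1) * math.log(n + 2))  # assumed smaller Q step size
        for i in range(2):
            own, opp = acts[i], acts[1 - i]
            payoff = r0 if i == 0 else -r0
            # Move the belief toward the observed opponent action (empirical frequency).
            b = belief[i][state]
            for k in range(n_actions):
                b[k] += alpha * ((1.0 if k == opp else 0.0) - b[k])
            # Q-learning-style update with a greedy value estimate at the next state.
            v_next = max(expected_values(q[i][next_state], belief[i][next_state]))
            q[i][state][own][opp] += beta * (payoff + gamma * v_next
                                             - q[i][state][own][opp])
        state = next_state
    return belief, q


if __name__ == "__main__":
    beliefs, _ = run()
    print("Player 0's belief about player 1 at state 0:", beliefs[0][0])
```

With rewards in [-1, 1] and both step sizes below 1, every Q-update is a convex combination, so the estimates stay within 1/(1 - gamma) in absolute value; the sketch makes no claim about convergence rates.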

Cited by 19 publications (4 citation statements)
References 30 publications

“…Computing and learning equilibria in Markov games has attracted considerable interest recently. Most focus has been on the Nash equilibrium in either identical-interest (or, more generally, potential) games (Fox et al 2022; Leonardos et al 2022; Aydin and Eksin 2023; Ding et al 2022; Zhang et al 2022b), or two-player zero-sum Markov games (Daskalakis, Foster, and Golowich 2020; Cen et al 2023; Wei et al 2021; Zhang et al 2020; Sayin et al 2021; Huang et al 2022; Cui and Du 2022; Perolat et al 2015; Zeng, Doan, and Romberg 2022; Pattathil, Zhang, and Ozdaglar 2023; Yang and Ma 2023), albeit with a few exceptions (Qin and Etesami 2023; Sayin 2023; Giannou et al 2022; Kalogiannis and Panageas 2023; Kalogiannis et al 2023; Park, Zhang, and Ozdaglar 2023). In general-sum multi-player games, in light of the intractability of Nash equilibria, most focus has been on computing or indeed learning (coarse) correlated equilibria (Daskalakis, Golowich, and Zhang 2023; Jin et al 2021; Erez et al 2023; Liu, Szepesvári, and Jin 2022; Zhang et al 2022a).…”
Section: Further Related Work (mentioning)
confidence: 99%
“…See Section 3.1 for more details. The error induced from such non-zero-sum structure appears in existing work Sayin et al (2021, 2022a), and was handled by designing a novel truncated Lyapunov function. However, the truncated Lyapunov function was sufficient to establish the asymptotic convergence, but did not provide the explicit rate at which the induced error goes to zero.…”
Section: Challenges and Our Techniques (mentioning)
confidence: 99%
“…Best-response type independent learning for stochastic games has attracted increasing attention lately (Leslie et al, 2020; Sayin et al, 2021, 2022a; Baudin and Laraki, 2022b,a; Maheshwari et al, 2022), with Sayin et al (2021, 2022a); Baudin and Laraki (2022b,a) tackling the zero-sum setting. However, only asymptotic convergence was established in these works.…”
Section: Sample-efficient MARL (mentioning)
confidence: 99%
“…Assuming a blockchain system with high throughput and high capacity, this paper proposes a future vision for a decentralized game (degame) in which gamefi is transformed into degame by weakening its financial nature and highlighting its gameplay [10].…”
Section: A Conception For a De-game With Achievable Performance (mentioning)
confidence: 99%