2020
DOI: 10.48550/arxiv.2007.13544
Preprint

Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

Noam Brown,
Anton Bakhtin,
Adam Lerer
et al.

Abstract: The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by the success of AlphaZero. However, algorithms of this form have been unable to cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement learning and search for imperfect-information games. In the simpler setting of perfect-informat…
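As a rough illustration of the training-and-search paradigm the abstract describes, the sketch below shows a generic self-play loop in which a search procedure produces both an acting policy and a value target at each state, and the collected targets are used to train a value function. This is a minimal sketch under my own assumptions, not the authors' ReBeL implementation; every name in it (ValueNet, run_search, the belief-state strings) is a hypothetical placeholder.

```python
import random

# Generic self-play-with-search loop (illustrative sketch only).
# All components below are placeholders, not the paper's actual code.

class ValueNet:
    """Placeholder value function over public belief states."""
    def predict(self, belief):
        return 0.0  # a real network would return an estimated value

    def train(self, examples):
        pass  # a real network would regress values onto search targets


def run_search(belief, value_net, iterations=100):
    """Placeholder for an equilibrium-finding search at `belief`,
    using `value_net` to evaluate leaf states."""
    policy = {"action_a": 0.5, "action_b": 0.5}   # dummy uniform policy
    value_target = value_net.predict(belief)       # dummy value target
    return policy, value_target


def self_play_episode(value_net, horizon=10):
    belief = "initial_public_belief_state"  # stand-in for a real belief object
    examples = []
    for _ in range(horizon):
        policy, target = run_search(belief, value_net)
        examples.append((belief, target))                  # training example
        action = random.choices(list(policy), weights=list(policy.values()))[0]
        belief = f"{belief}/{action}"                       # stand-in transition
    return examples


value_net = ValueNet()
for _ in range(3):                 # a few self-play training iterations
    value_net.train(self_play_episode(value_net))
```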

Cited by 9 publications (12 citation statements)
References 28 publications
“…We see at least three benefits of the public-state formulation of CFR. The first is conceptual: Many recent extensions of CFR (e.g., [2,19]) heavily rely on decomposition and public states. Since PS-CFR is also formulated in terms of public states, PS-CFR serves as a much more suitable basis for these extensions than Vanilla-CFR.…”
Section: Discussion
confidence: 99%
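To make the notion of a public state in the statement above concrete, here is a small illustration of my own (not from the citing paper): in Kuhn poker every action is public, so a public state is just the betting history, and the information sets that share a history differ only in the acting player's private card.

```python
from collections import defaultdict

# Toy illustration: grouping Kuhn-poker information sets by public state.
CARDS = ["J", "Q", "K"]
HISTORIES_P1_TO_ACT = [""]          # player 1 acts at the empty history
HISTORIES_P2_TO_ACT = ["c", "b"]    # player 2 acts after a check or a bet

def group_infosets_by_public_state():
    groups = defaultdict(list)
    for history in HISTORIES_P1_TO_ACT + HISTORIES_P2_TO_ACT:
        for card in CARDS:
            infoset = (card, history)          # what one player actually knows
            groups[history].append(infoset)    # public state = shared knowledge
    return dict(groups)

if __name__ == "__main__":
    for public, infosets in group_infosets_by_public_state().items():
        print(f"public state {public!r}: {infosets}")
```

A public-state formulation of CFR traverses these shared public states and updates all of the information sets within each one together, which is what makes decomposition-based extensions natural.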
“…The joint probability distribution over types is common knowledge, but each player only knows their own type. (2) The players play a perfect-information extensive-form game. (3) Rewards in this game depend not only on actions taken but also on the private types of all players.…”
Section: Introduction
confidence: 99%
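The three-part model quoted above can be made concrete with a toy example. The sketch below is my own illustration, not code from the citing paper: a common-knowledge prior over private types, a short perfect-information action stage, and a reward that depends on both the actions taken and the private types.

```python
import random

# Toy instance of the three-part model: private types, public actions,
# type-dependent rewards. Illustrative only.
TYPE_PRIOR = {
    ("strong", "strong"): 0.25, ("strong", "weak"): 0.25,
    ("weak", "strong"): 0.25, ("weak", "weak"): 0.25,
}  # (1) joint distribution over types, known to everyone
ACTIONS = ["aggressive", "passive"]

def reward_p0(a0, a1, types):
    """(3) Player 0's reward depends on both actions and both private types."""
    bonus = 1.0 if types[0] == "strong" else -1.0
    clash = 1.0 if a0 == a1 else 0.0
    return bonus + clash

# One play-through. Each player would observe only their own entry of `types`.
types = random.choices(list(TYPE_PRIOR), weights=list(TYPE_PRIOR.values()))[0]
a0 = random.choice(ACTIONS)          # player 0 moves first
a1 = random.choice(ACTIONS)          # (2) player 1 observes a0 before acting
print(types, (a0, a1), reward_p0(a0, a1, types))
```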
“…RL methods have achieved superhuman levels of performance in sequential decision making, famously for arcade games (Mnih et al., 2013), board games (Silver et al., 2017), online multiplayer games (Berner et al., 2019) and imperfect-information games (Brown et al., 2020). These successes demonstrated RL's ability to operate on ambiguous data, understand complex environments and infer high-level causal relationships.…”
Section: Reinforcement Learning Reasoning
confidence: 99%
“…Nash equilibrium [39], an important concept in game theory, is a strategy profile from which no player can benefit by unilaterally changing their own strategy. Because of this property, researchers have paid much attention to approximating Nash equilibria [40], [41]. Tree search methods have long been a mainstream approach for turn-based games.…”
Section: How To Reach Nash Equilibrium?
confidence: 99%
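As a concrete check of the Nash condition described in that statement, the sketch below uses the standard matching-pennies payoffs (my own example, not from the citing paper) to verify that the uniform mixed strategy profile leaves neither player with a profitable unilateral deviation.

```python
import numpy as np

# Matching pennies payoffs for the row player (zero-sum game).
# A Nash equilibrium is a strategy profile from which no player gains by
# deviating unilaterally; here it is the 50/50 mix for both players.
A = np.array([[ 1, -1],
              [-1,  1]], dtype=float)

row = np.array([0.5, 0.5])
col = np.array([0.5, 0.5])

row_value = row @ A @ col              # expected payoff to the row player
best_row_deviation = max(A @ col)      # best pure-strategy reply for row
best_col_deviation = max(-(row @ A))   # best pure reply for column (payoff -A)

print(row_value, best_row_deviation, best_col_deviation)
# 0.0 0.0 0.0 -> no unilateral deviation improves either player's payoff
```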