Nolan Bard scite author profile

Artificial intelligence has seen several breakthroughs in recent years, with games often serving as milestones. A common feature of these games is that players have perfect information. Poker, the quintessential game of imperfect information, is a long-standing challenge problem in artificial intelligence. We introduce DeepStack, an algorithm for imperfect-information settings. It combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition that is automatically learned from self-play using deep learning. In a study involving 44,000 hands of poker, DeepStack defeated, with statistical significance, professional poker players in heads-up no-limit Texas hold'em. The approach is theoretically sound and is shown to produce strategies that are more difficult to exploit than prior approaches.

show abstract

The Hanabi challenge: A new frontier for AI research

Bard¹,

Foerster

Chandar

et al. 2020

Artificial Intelligence

166

128

View full text Add to dashboard Cite

From the early days of computing, games have been important testbeds for studying how well machines can do sophisticated decision making. In recent years, machine learning has made dramatic advances with artificial agents reaching superhuman performance in challenge domains like Go, Atari, and some variants of poker. As with their predecessors of chess, checkers, and backgammon, these game domains have driven research by providing sophisticated yet well-defined challenges for artificial intelligence practitioners. We continue this tradition by proposing the game of Hanabi as a new challenge domain with novel problems that arise from its combination of purely cooperative gameplay with two to five players and imperfect information. In particular, we argue that Hanabi elevates reasoning about the beliefs and intentions of other agents to the foreground. We believe developing novel techniques for such theory of mind reasoning will not only be crucial for success in Hanabi, but also in broader collaborative efforts, especially those with human partners. To facilitate future research, we introduce the open-source Hanabi Learning Environment, propose an experimental framework for the research community to evaluate algorithmic advances, and assess the performance of current state-of-the-art techniques. 6 One such equilibrium occurs when players do not intentionally communicate information to other players, and ignore what other players tell them (historically called a pooling equilibrium in pure signalling games [15], or a babbling equilibrium in later work using cheap talk [16]). In this case, there is no incentive for a player to start communicating because they will be ignored, and there is no incentive to pay attention to other players because they are not communicating.7 In pure signalling games where actions are purely communicative, policies are often referred to as communication protocols. Though Hanabi is not such a pure signalling game, when we want to emphasize the communication properties of an agent's policy we will still refer to its communication protocol. 8 We use the word convention to refer to the parts of a communication protocol or policy that interrelate. Technically, these can be thought of as constraints on the policy to enact the convention.

show abstract

Player of Games

Schmid¹,

Moravčík²,

Burch³

et al. 2021

Preprint

View full text Add to dashboard Cite

Games have a long history of serving as a benchmark for progress in artificial intelligence. Recently, approaches using search and learning have shown strong performance across a set of perfect information games, and approaches using game-theoretic reasoning and learning have shown strong performance for specific imperfect information poker variants. We introduce Player of Games, a general-purpose algorithm that unifies previous approaches, combining guided search, self-play learning, and game-theoretic reasoning. Player of Games is the first algorithm to achieve strong empirical performance in large perfect and imperfect information games -an important step towards truly general algorithms for arbitrary environments. We prove that Player of Games is sound, converging to perfect play as available computation time and approximation capacity increases. Player of Games reaches strong performance in chess and Go, beats the strongest openly available agent in heads-up no-limit Texas hold'em poker (Slumbot), and defeats the state-of-the-art agent in Scotland Yard, an imperfect information game that illustrates the value of guided search, learning, and game-theoretic reasoning.

show abstract

Do pokers players know how good they are? Accuracy of poker skill estimation in online and offline players

MacKay

Bard

Bowling

et al. 2014

Computers in Human Behavior

View full text Add to dashboard Cite

The Annual Computer Poker Competition

Bard

Hawkin

Rubin³

et al. 2013

AI Magazine

View full text Add to dashboard Cite

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Nolan Bard

DeepStack: Expert-level artificial intelligence in heads-up no-limit poker

The Hanabi challenge: A new frontier for AI research

Player of Games

Do pokers players know how good they are? Accuracy of poker skill estimation in online and offline players

The Annual Computer Poker Competition

Contact Info

Product

Resources

About