Recursive Monte Carlo search for imperfect information games

Furtak, Timothy; Buro, Michael

doi:10.1109/cig.2013.6633646

Cited by 29 publications

(21 citation statements)

References 12 publications


“…Monte Carlo Search is commonly used for card games as well, as they commonly feature imperfect information (cf. [8,20]). Another route would be the implementation of AIs that imitate human players.…”

Section: Discussion (mentioning)

confidence: 99%

Demonstrating the Feasibility of Automatic Game Balancing

Volz

Rudolph

Naujoks

2016

Proceedings of the Genetic and Evolutionary Computation Conference 2016


Game balancing is an important part of the (computer) game design process, in which designers adapt a game prototype so that the resulting gameplay is as entertaining as possible. In industry, the evaluation of a game is often based on costly playtests with human players. It suggests itself to automate this process using surrogate models for the prediction of gameplay and outcome. In this paper, the feasibility of automatic balancing using simulation- and deck-based objectives is investigated for the card game top trumps. Additionally, the necessity of a multi-objective approach is asserted by a comparison with the only known (single-objective) method. We apply a multi-objective evolutionary algorithm to obtain decks that optimise objectives, e.g. win rate and average number of tricks, developed to express the fairness and the excitement of a game of top trumps. The results are compared with decks from published top trumps decks using simulation-based objectives. The possibility to generate decks better or at least as good as decks from published top trumps decks in terms of these objectives is demonstrated. Our results indicate that automatic balancing with the presented approach is feasible even for more complex games such as real-time strategy games.


“…It grows a tree over information sets for each player instead of constructing a separate tree for each determinization. ISMCTS suffered from the information leaking problem, as pointed out by [Furtak and Buro, 2013]. introduced Self-play MCTS and used a separate search tree for each player.…”

Section: Related Work (mentioning)

confidence: 99%

DeltaDou: Expert-level Doudizhu AI through Self-play

Jiang

Li

Du

et al. 2019

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence


Artificial Intelligence has seen several breakthroughs in two-player perfect information games. Nevertheless, Doudizhu, a three-player imperfect information game, is still quite challenging. In this paper, we present a Doudizhu AI by applying deep reinforcement learning from games of self-play. The algorithm combines an asymmetric MCTS on nodes of information set of each player, a policy-value network that approximates the policy and value on each decision node, and inference on unobserved hands of other players by given policy. Our results show that self-play can significantly improve the performance of our agent in this multi-agent imperfect information game. Even starting with a weak AI, our agent can achieve human expert level after days of self-play and training.


“…Despite its shortcomings, Perfect Information Monte-Carlo (PIMC) Search [7] continues to be the state-of-the-art cardplay method for Skat and other trick-taking card games like Bridge [8] and Hearts [9]. Later, Imperfect Information Monte-Carlo Search [10] and Information Set Monte Carlo Tree Search [11] sought to address some of the issues inherent in PIMC while still relying on the use of state determinization and a forward model.…”

Section: B Previous Work (mentioning)

confidence: 99%

Learning Policies from Human Data for Skat

Rebstock

Solinas

Buro

2019

2019 IEEE Conference on Games (CoG)

Self Cite


Decision-making in large imperfect information games is difficult. Thanks to recent success in Poker, Counterfactual Regret Minimization (CFR) methods have been at the forefront of research in these games. However, most of the success in large games comes with the use of a forward model and powerful state abstractions. In trick-taking card games like Bridge or Skat, large information sets and an inability to advance the simulation without fully determinizing the state make forward search problematic. Furthermore, state abstractions can be especially difficult to construct because the precise holdings of each player directly impact move values. In this paper we explore learning model-free policies for Skat from human game data using deep neural networks (DNN). We produce a new state-of-the-art system for bidding and game declaration by introducing methods to a) directly vary the aggressiveness of the bidder and b) declare games based on expected value while mitigating issues with rarely observed state-action pairs. Although cardplay policies learned through imitation are slightly weaker than the current best search-based method, they run orders of magnitude faster. We also explore how these policies could be learned directly from experience in a reinforcement learning setting and discuss the value of incorporating human data for this task.
