2021
DOI: 10.3233/icg-200157
Polygames: Improved zero learning

Abstract: Since DeepMind’s AlphaZero, Zero learning quickly became the state-of-the-art method for many board games. It can be improved using a fully convolutional structure (no fully connected layer). Using such an architecture plus global pooling, we can create bots independent of the board size. The training can be made more robust by keeping track of the best checkpoints during the training and by training against them. Using these features, we release Polygames, our framework for Zero learning, with its library of …
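The abstract's key architectural claim is that a fully convolutional network with global pooling has no parameters tied to the board dimensions, so one set of weights can evaluate boards of any size. The following is a minimal NumPy sketch of that idea, not the actual Polygames architecture: all names (`conv2d_valid`, `value`, the kernel shapes) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared weights: eight 3x3 convolution kernels and a linear value head.
# Nothing here depends on the board's height or width.
kernels = rng.standard_normal((8, 3, 3))
head_w = rng.standard_normal(8)

def conv2d_valid(board, kernel):
    """'Valid' 2D cross-correlation of a single-channel board."""
    h, w = board.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(board[i:i + 3, j:j + 3] * kernel)
    return out

def value(board):
    # Convolutional feature maps, whose spatial size tracks the board...
    feats = np.stack([conv2d_valid(board, k) for k in kernels])
    # ...then global average pooling collapses any HxW map to one number
    # per channel, so the head sees a fixed-size (8,) vector.
    pooled = feats.mean(axis=(1, 2))
    return np.tanh(pooled @ head_w)

# The same weights score both a 9x9 and a 19x19 board.
v9 = value(rng.standard_normal((9, 9)))
v19 = value(rng.standard_normal((19, 19)))
```

A network with a fully connected layer applied directly to the feature maps would instead need a weight matrix whose size depends on H×W, locking it to one board size.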

Cited by 32 publications (44 citation statements); references 11 publications.
“…In recent years, this is often based on combinations of Monte-Carlo Tree Search (MCTS) (Kocsis & Szepesvári, 2006;Coulom, 2007b;Browne et al, 2012) and deep neural networks (DNNs) (LeCun, Bengio, & Hinton, 2015). In principle this combination of techniques can be successfully applied (Silver et al, 2016;Anthony et al, 2017;Silver et al, 2017;Lorentz & Zosa IV, 2017;Silver et al, 2018;Tian et al, 2019;Morandin et al, 2019;Wu, 2019;Cazenave et al, 2020;Cazenave, 2020) to a wide variety of games. However, in practice the high computational requirements make it infeasible to scale this up to large-scale studies that involve training agents for hundreds or thousands of distinct games (Stephenson, Crist, & Browne, 2020), in addition to possibly many more variants of games generated automatically as possible reconstructions of games with incomplete rules (Browne, 2018;Browne et al, 2019b).…”
Section: Introduction (mentioning)
confidence: 99%
“…Self-play training approaches such as those popularised by AlphaGo Zero (Silver et al, 2017) and AlphaZero (Silver et al, 2018), based on combinations of Monte-Carlo tree search (MCTS) (Kocsis and Szepesvári, 2006;Coulom, 2007;Browne et al, 2012) and Deep Learning (LeCun et al, 2015), have been demonstrated to be fairly generally applicable, and achieved state-of-the-art results in a variety of board games such as Go (Silver et al, 2017), Chess, Shogi (Silver et al, 2018), Hex, and Havannah (Cazenave et al, 2020). These approaches require relatively little domain knowledge, but still require some in the form of:…”
Section: Introduction (mentioning)
confidence: 99%
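The citation statement above describes the AlphaZero-style combination of MCTS with a learned policy prior. The standard selection rule in that family is PUCT, which scores each child by its mean value Q plus an exploration bonus weighted by the network's prior P. Below is a generic sketch of that rule, not code from Polygames or any cited system; the dictionary-based node representation and `c_puct` value are illustrative assumptions.

```python
import math

def puct_select(children, c_puct=1.5):
    """AlphaZero-style PUCT selection: maximize
    Q(s,a) + c_puct * P(s,a) * sqrt(sum_b N(s,b)) / (1 + N(s,a)),
    where P is the network's prior, N the visit count, W the total value."""
    total_n = sum(child["n"] for child in children)

    def score(child):
        # Mean value of the child (0 for unvisited nodes).
        q = child["w"] / child["n"] if child["n"] > 0 else 0.0
        # Exploration bonus: high prior and low visit count raise it.
        u = c_puct * child["p"] * math.sqrt(total_n) / (1 + child["n"])
        return q + u

    return max(children, key=score)

# Three candidate moves: prior p, visit count n, accumulated value w.
children = [
    {"p": 0.6, "n": 10, "w": 4.0},
    {"p": 0.3, "n": 2, "w": 1.5},
    {"p": 0.1, "n": 0, "w": 0.0},
]
best = puct_select(children)  # the well-valued, under-visited second move
```

The bonus term decays as a move accumulates visits, so search effort shifts from the prior's favorites toward moves whose observed values hold up.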
“…In this paper, we describe how we combine the GGP system Ludii and the PyTorch-based (Paszke et al, 2019) state-of-the-art training algorithms in Polygames (Cazenave et al, 2020), with the goal of mitigating all three of the requirements for domain knowledge listed above. Section 2 provides some background information on these training techniques.…”
Section: Introduction (mentioning)
confidence: 99%
“…• Breakthrough: DaSoJai (author: Wei-Lin Wu and Shun-Shii Lin) and Polygames (Facebook NDHU). Polygames (Cazenave et al, 2020) is a reimplementation of AlphaZero. • Amazons: 8QP (Johan de Koning) and SherlockGo (Liang Shuang, Liang Tailin, Wang Jilong, Li Xiaorui, and Zhou Ke).…”
(mentioning)
confidence: 99%