2021
DOI: 10.3233/icg-200157
Polygames: Improved zero learning

Abstract: Since DeepMind’s AlphaZero, Zero learning quickly became the state-of-the-art method for many board games. It can be improved using a fully convolutional structure (no fully connected layer). Using such an architecture plus global pooling, we can create bots independent of the board size. The training can be made more robust by keeping track of the best checkpoints during the training and by training against them. Using these features, we release Polygames, our framework for Zero learning, with its library of …
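The abstract's key architectural claim is that a fully convolutional network with global pooling has no parameters tied to the board dimensions, so one set of weights can evaluate boards of any size. The following is a minimal NumPy sketch of that idea, not the actual Polygames architecture: all names (`conv2d_valid`, `value`, the kernel shapes) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared weights: eight 3x3 convolution kernels and a linear value head.
# Nothing here depends on the board's height or width.
kernels = rng.standard_normal((8, 3, 3))
head_w = rng.standard_normal(8)

def conv2d_valid(board, kernel):
    """'Valid' 2D cross-correlation of a single-channel board."""
    h, w = board.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(board[i:i + 3, j:j + 3] * kernel)
    return out

def value(board):
    # Convolutional feature maps, whose spatial size tracks the board...
    feats = np.stack([conv2d_valid(board, k) for k in kernels])
    # ...then global average pooling collapses any HxW map to one number
    # per channel, so the head sees a fixed-size (8,) vector.
    pooled = feats.mean(axis=(1, 2))
    return np.tanh(pooled @ head_w)

# The same weights score both a 9x9 and a 19x19 board.
v9 = value(rng.standard_normal((9, 9)))
v19 = value(rng.standard_normal((19, 19)))
```

A network with a fully connected layer applied directly to the feature maps would instead need a weight matrix whose size depends on H×W, locking it to one board size.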

Cited by 32 publications (44 citation statements); references 11 publications.
“…In recent years, this is often based on combinations of Monte-Carlo Tree Search (MCTS) (Kocsis & Szepesvári, 2006;Coulom, 2007b;Browne et al, 2012) and deep neural networks (DNNs) (LeCun, Bengio, & Hinton, 2015). In principle this combination of techniques can be successfully applied (Silver et al, 2016;Anthony et al, 2017;Silver et al, 2017;Lorentz & Zosa IV, 2017;Silver et al, 2018;Tian et al, 2019;Morandin et al, 2019;Wu, 2019;Cazenave et al, 2020;Cazenave, 2020) to a wide variety of games. However, in practice the high computational requirements make it infeasible to scale this up to large-scale studies that involve training agents for hundreds or thousands of distinct games (Stephenson, Crist, & Browne, 2020), in addition to possibly many more variants of games generated automatically as possible reconstructions of games with incomplete rules (Browne, 2018;Browne et al, 2019b).…”
Section: Introduction (mentioning)
confidence: 99%
“…Self-play training approaches such as those popularised by AlphaGo Zero (Silver et al, 2017) and AlphaZero (Silver et al, 2018), based on combinations of Monte-Carlo tree search (MCTS) (Kocsis and Szepesvári, 2006;Coulom, 2007;Browne et al, 2012) and Deep Learning (LeCun et al, 2015), have been demonstrated to be fairly generally applicable, and achieved state-of-the-art results in a variety of board games such as Go (Silver et al, 2017), Chess, Shogi (Silver et al, 2018), Hex, and Havannah (Cazenave et al, 2020). These approaches require relatively little domain knowledge, but still require some in the form of:…”
Section: Introduction (mentioning)
confidence: 99%
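The citation statement above describes the AlphaZero-style combination of MCTS with a learned policy prior. The standard selection rule in that family is PUCT, which scores each child by its mean value Q plus an exploration bonus weighted by the network's prior P. Below is a generic sketch of that rule, not code from Polygames or any cited system; the dictionary-based node representation and `c_puct` value are illustrative assumptions.

```python
import math

def puct_select(children, c_puct=1.5):
    """AlphaZero-style PUCT selection: maximize
    Q(s,a) + c_puct * P(s,a) * sqrt(sum_b N(s,b)) / (1 + N(s,a)),
    where P is the network's prior, N the visit count, W the total value."""
    total_n = sum(child["n"] for child in children)

    def score(child):
        # Mean value of the child (0 for unvisited nodes).
        q = child["w"] / child["n"] if child["n"] > 0 else 0.0
        # Exploration bonus: high prior and low visit count raise it.
        u = c_puct * child["p"] * math.sqrt(total_n) / (1 + child["n"])
        return q + u

    return max(children, key=score)

# Three candidate moves: prior p, visit count n, accumulated value w.
children = [
    {"p": 0.6, "n": 10, "w": 4.0},
    {"p": 0.3, "n": 2, "w": 1.5},
    {"p": 0.1, "n": 0, "w": 0.0},
]
best = puct_select(children)  # the well-valued, under-visited second move
```

The bonus term decays as a move accumulates visits, so search effort shifts from the prior's favorites toward moves whose observed values hold up.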
“…In this paper, we describe how we combine the GGP system Ludii and the PyTorch-based (Paszke et al, 2019) state-of-the-art training algorithms in Polygames (Cazenave et al, 2020), with the goal of mitigating all three of the requirements for domain knowledge listed above. Section 2 provides some background information on these training techniques.…”
Section: Introduction (mentioning)
confidence: 99%
“…• Breakthrough: DaSoJai (author: Wei-Lin Wu and Shun-Shii Lin) and Polygames (Facebook NDHU). Polygames (Cazenave et al, 2020) is a reimplementation of AlphaZero. • Amazons: 8QP (Johan de Koning) and SherlockGo (Liang Shuang, Liang Tailin, Wang Jilong, Li Xiaorui, and Zhou Ke).…”
(mentioning)
confidence: 99%