2017
DOI: 10.1007/978-3-319-57969-6_1

NeuroHex: A Deep Q-learning Hex Agent

Abstract: DeepMind's recent spectacular success in using deep convolutional neural nets and machine learning to build superhuman-level agents (e.g., for Atari games via deep Q-learning and for the game of Go via reinforcement learning) raises many questions, including to what extent these methods will succeed in other domains. In this paper we consider DQL for the game of Hex: after supervised initializing, we use self-play to train NeuroHex, an 11-layer CNN that plays Hex on the 13×13 board. Hex is the classic two-player…
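The abstract names the key ingredients, deep Q-learning plus an 11-layer CNN over the 13×13 board, without architectural detail. As a rough illustration only, here is a minimal PyTorch sketch of what such a Q-network could look like; the channel counts, the two-plane input encoding, and the per-cell output head are assumptions for this sketch, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class HexQNet(nn.Module):
    """Illustrative 11-layer CNN mapping a 13x13 Hex position to one
    Q-value per board cell. Layer widths are assumed, not taken from
    the NeuroHex paper."""

    def __init__(self, board_size: int = 13, channels: int = 64):
        super().__init__()
        layers = [nn.Conv2d(2, channels, kernel_size=3, padding=1), nn.ReLU()]
        for _ in range(9):  # nine hidden conv layers
            layers += [nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                       nn.ReLU()]
        # Final 1x1 conv gives one Q-value per cell (11 conv layers total).
        layers.append(nn.Conv2d(channels, 1, kernel_size=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 2, 13, 13) -- one binary occupancy plane per player.
        return self.net(x).flatten(1)  # (batch, 169) Q-values

q_net = HexQNet()
position = torch.zeros(1, 2, 13, 13)   # empty board
q_values = q_net(position)             # (1, 169)
best_move = q_values.argmax(dim=1)     # greedy move index
```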

Cited by 9 publications (24 citation statements) | References 15 publications
“…MoHex2.0 plays the next move based on playout simulations, whose quality is improved using expert game records. Building on this, CNNs have been proposed to create more accurate evaluation functions [24], [25]. The CNN is expected to learn position features that cannot be represented by network characteristics.…”
Section: B. Conventional Methods to Build Value Functions
confidence: 99%
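The "evaluation function" role of the CNN here is a single scalar score for a position, as opposed to the per-move Q-values sketched above. A minimal sketch of such a value network follows; the trunk depth, widths, and the tanh output range are assumptions for illustration, not the construction in [24] or [25].

```python
import torch
import torch.nn as nn

class HexValueNet(nn.Module):
    """Illustrative CNN evaluation function: board in, scalar score out."""

    def __init__(self, board_size: int = 13, channels: int = 32):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(channels * board_size * board_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.trunk(x).flatten(1)
        # tanh keeps the evaluation in [-1, 1]: -1 = loss, +1 = win.
        return torch.tanh(self.head(h))
```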
“…Several reinforcement-learning algorithms that train value and policy functions independently have been proposed [8], [24], and it has been demonstrated that CNNs provide greater evaluation accuracy than classical evaluation functions. In addition, a reinforcement-learning algorithm called Expert Iteration (ExIt), which trains the two functions together, has been proposed; the effectiveness of ExIt has been shown on a 9×9 board [26].…”
Section: B. Conventional Methods to Build Value Functions
confidence: 99%
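ExIt alternates between an "expert" (a search procedure strengthened by the current network) and an "apprentice" (a network trained to imitate the expert). The loop below is a schematic sketch under that description only: the game helpers are toy stand-ins so it runs end to end, `expert_search` is a hypothetical placeholder for the apprentice-guided MCTS, and the table-based apprentice stands in for a neural network.

```python
import random

# Toy stand-ins so the sketch runs end to end; a real ExIt setup would
# plug in the actual game (e.g. 9x9 Hex) and a neural-net apprentice.
def initial_state():
    return [0] * 9                         # tiny board, 0 = empty

def legal_moves(state):
    return [i for i, c in enumerate(state) if c == 0]

def is_terminal(state):
    return not legal_moves(state)          # toy rule: game ends when full

def apply_move(state, move, player):
    nxt = list(state)
    nxt[move] = player
    return nxt

def expert_search(state, apprentice):
    # Hypothetical expert: real ExIt runs an MCTS guided by the
    # apprentice's policy here. Random play keeps the sketch runnable.
    return random.choice(legal_moves(state))

class Apprentice:
    """Toy apprentice that memorizes expert moves per state; in the
    cited work this is a neural network trained by imitation."""
    def __init__(self):
        self.table = {}

    def fit(self, dataset):
        for state, move in dataset:
            self.table.setdefault(tuple(state), []).append(move)

def expert_iteration(num_iters=3, games_per_iter=10):
    apprentice = Apprentice()
    for _ in range(num_iters):
        dataset = []
        for _ in range(games_per_iter):
            state, player = initial_state(), 1
            while not is_terminal(state):
                move = expert_search(state, apprentice)  # expert improves on the net
                dataset.append((state, move))            # imitation target
                state, player = apply_move(state, move, player), 3 - player
        apprentice.fit(dataset)                          # distill expert into apprentice
    return apprentice

apprentice = expert_iteration()
```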
“…Deep Q-Learning differs from the other methods mentioned earlier in that Q-learning is not based on search. The use of Q-learning in two-player board games has been studied before in several different types of games [9, 15, 21–24]. In [22], Deep Q-Learning still found it difficult to beat the search-based method in the game of Hex even after being trained for two weeks (about 60,000 episodes).…”
Section: Introduction
confidence: 99%
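The distinction drawn here, that Q-learning scores moves directly rather than searching, comes down to the bootstrapped target it regresses toward. Below is a sketch of that target for alternating two-player play, assuming the common negamax convention in which the opponent's best reply is negated; this illustrates the general technique, not the exact loss used in [22] or [24].

```python
import torch

def q_target(reward: float, done: bool, next_q: torch.Tensor) -> torch.Tensor:
    """Bootstrapped Q-learning target for zero-sum alternating play.

    next_q holds the opponent's Q-values in the successor position, so
    the opponent's best move is worst for us -- hence the negation
    (negamax convention, assumed here for illustration)."""
    if done:
        return torch.tensor(reward)   # terminal: win/loss signal only
    return -next_q.max()

# Usage: the network's prediction q(s, a) is regressed toward this target.
target = q_target(reward=0.0, done=False, next_q=torch.tensor([0.2, -0.1, 0.4]))
```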
“…The use of Q-learning in two-player board games has been studied before in several different types of games [9, 15, 21–24]. In [22], Deep Q-Learning still found it difficult to beat the search-based method in the game of Hex even after being trained for two weeks (about 60,000 episodes). The long training process is also shown in [24].…”
Section: Introduction
confidence: 99%
“…Recent studies show that, compared with traditional rectangle-based CNN models, CNN models with hexagon-shaped filters achieve better performance in applications such as Imaging Atmospheric Cherenkov Telescope (IACT) data analysis [2], [19], [20], [23], Hex move prediction [27], and IceCube data analysis [8]. Applying hexagonal filters in group CNNs can even surpass the performance of traditional CNN models on image-classification tasks on data sets such as CIFAR-10 [6], [22], [26].…”
Section: Introduction
confidence: 99%
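One common way to realize a hexagon-shaped filter on standard tensor layouts is to store the hex grid in axial coordinates and zero out the two corners of a 3×3 kernel that fall outside the six-cell hexagonal neighborhood. The sketch below shows that masking trick; it is a generic illustration, not the specific construction used in the cited works.

```python
import torch
import torch.nn as nn

class HexConv2d(nn.Conv2d):
    """3x3 convolution masked to a hexagonal neighborhood.

    Assumes the hex board is stored in axial coordinates, where each
    cell's six neighbors plus itself occupy a 3x3 patch minus two
    opposite corners; those corner weights are zeroed on every forward."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__(in_ch, out_ch, kernel_size=3, padding=1)
        mask = torch.ones(3, 3)
        mask[0, 0] = 0.0   # corner outside the hex neighborhood
        mask[2, 2] = 0.0   # opposite corner, also outside
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return nn.functional.conv2d(
            x, self.weight * self.mask, self.bias, padding=1
        )

conv = HexConv2d(2, 16)
out = conv(torch.zeros(1, 2, 13, 13))   # (1, 16, 13, 13)
```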