2007
DOI: 10.3233/icg-2007-30403

Computing “Elo Ratings” of Move Patterns in the Game of Go

Abstract: Move patterns are an essential method to incorporate domain knowledge into Go-playing programs. This paper presents a new Bayesian technique for supervised learning of such patterns from game records, based on a generalization of Elo ratings. Each sample move in the training data is considered as a victory of a team of pattern features. Elo ratings of individual pattern features are computed from these victories, and can be used in previously unseen positions to compute a probability distribution over legal mo…
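As a rough illustration of the idea in the abstract, the sketch below treats each candidate move as a "team" of pattern features and turns feature strengths into a probability distribution over moves. It assumes the usual generalized Bradley-Terry form, in which a team's strength is the product of its members' strengths and a strength gamma corresponds to an Elo rating of 400·log10(gamma). The feature names, strength values, and function names are hypothetical and chosen only for the example.

```python
from math import log10, prod


def gamma_to_elo(gamma):
    """Convert a Bradley-Terry strength into an Elo-style rating."""
    return 400.0 * log10(gamma)


def move_probabilities(gammas, candidate_moves):
    """Return P(move) for each candidate move.

    gammas: dict mapping feature name -> strength (hypothetical values).
    candidate_moves: dict mapping move -> list of features present at that move.
    A move's team strength is the product of its features' strengths.
    """
    strengths = {
        move: prod(gammas[f] for f in features)
        for move, features in candidate_moves.items()
    }
    total = sum(strengths.values())
    return {move: s / total for move, s in strengths.items()}


# Hypothetical feature strengths and candidate moves, for illustration only.
gammas = {"atari": 4.0, "capture": 6.0, "edge": 0.5, "pattern_A": 2.0}
moves = {
    "D4": ["pattern_A"],
    "C3": ["atari", "edge"],
    "Q16": ["capture", "pattern_A"],
}
probs = move_probabilities(gammas, moves)
```

Here the team strengths are 2, 2, and 12, so `Q16` receives three quarters of the probability mass. Learning the strengths from game records, which is the subject of the paper, is not shown.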

Cited by 212 publications (213 citation statements)
References 11 publications
“…This algorithm has the advantage that it is anytime: we do not have to know in advance at which value of t the algorithm will be stopped. [3] applied it successfully in the very efficient CrazyStone implementation of Monte-Carlo Tree Search [4]. Upper Confidence Tree (or Monte-Carlo Tree Search) is not a simple setting as above: when applying an option, we reach a new state; one can think of Monte-Carlo Tree Search (or UCT) as having one bandit in each possible state s of the reinforcement learning problem, for choosing between (infinitely many) options o_1(s), o_2(s), …”
Section: Progressive Widening (mentioning)
confidence: 99%
“…Progressive strategies have been proposed in [4,2] for tackling problems with big action spaces; they have been theoretically analyzed in [13], and used for continuous spaces in [11,12]. We will here (i) define a variant of progressive widening (section 2.1), (ii) show why it can't be directly applied in some cases (section 2.2), (iii) define our version (section 2.3).…”
Section: Introduction (mentioning)
confidence: 99%
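The progressive strategies mentioned in this statement can be sketched in a few lines. The version below is an assumption, not the formulation of any of the cited papers: it uses one common polynomial schedule in which a node visited n times may consider only the top ceil(c · n^alpha) actions of a ranked list. The ranking could come, for example, from pattern-based move probabilities. The names `allowed_actions`, `select_candidates`, `c`, and `alpha` are hypothetical.

```python
import math


def allowed_actions(visits, c=1.0, alpha=0.5):
    """Number of actions a node may consider after `visits` simulations.

    Assumed schedule: ceil(c * visits**alpha), with at least one action.
    """
    return max(1, math.ceil(c * visits ** alpha))


def select_candidates(ranked_actions, visits, c=1.0, alpha=0.5):
    """Keep only the best-ranked actions currently unlocked for this node.

    ranked_actions must be sorted from most to least promising.
    """
    k = allowed_actions(visits, c, alpha)
    return ranked_actions[:k]


# Hypothetical ranked move list for one node.
ranked = ["Q16", "D4", "C3", "K10", "R3", "F17"]
early = select_candidates(ranked, visits=1)
later = select_candidates(ranked, visits=10)
```

Because the schedule depends only on the current visit count, the search can be stopped at any time, which matches the anytime property noted in the first statement.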
“…Many people have tried to improve the MC engine by increasing its level (the strength of the Monte-Carlo simulator as a standalone player), but it is shown clearly in [13,10] that this is not the good criterion: a MC engine MC_1 which plays significantly better than another MC_2 can lead to very poor results as a module in MCTS, whenever the computational cost is the same. Some MC engines have been learnt on datasets [8], but the results are strongly improved by changing the constants manually. In that sense, designing and calibrating a MC engine remains an open challenge: one has to intensively experiment a modification in order to validate it.…”
Section: Improving Monte-Carlo (MC) Simulations (mentioning)
confidence: 99%
“…It has been greatly improved by including Progressive Widening and Double Progressive Widening [6,2], RAVE values [7], Blind Values [4], and handcrafted Monte-Carlo moves [17,10]. A crucial component is the Monte-Carlo move generator, also known as the playout generator.…”
Section: Introduction (mentioning)
confidence: 99%