2006 IEEE Symposium on Computational Intelligence and Games
DOI: 10.1109/cig.2006.311681

Temporal Difference Learning Versus Co-Evolution for Acquiring Othello Position Evaluation

Abstract: This paper compares the use of temporal difference learning (TDL) versus co-evolutionary learning (CEL) for acquiring position evaluation functions for the game of Othello. The paper provides important insights into the strengths and weaknesses of each approach. The main findings are that for Othello, TDL learns much faster than CEL, but that properly tuned CEL can learn better playing strategies. For CEL, it is essential to use parent-child weighted averaging in order to achieve good performance. Usi…
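To connect the abstract to concrete update rules, here is a minimal, hypothetical sketch (not the authors' code) of a TD(0) update for a linear Othello position evaluator, together with the parent-child weighted averaging step the abstract highlights for CEL. The tanh squashing, board encoding, learning rate, and mixing weight beta are assumptions, not details taken from the paper.

```python
import numpy as np

N_SQUARES = 64  # 8 x 8 Othello board

def evaluate(weights, board):
    """Position value in (-1, 1): tanh of a weighted sum over squares.

    `board` is a length-64 vector with +1 (own disc), -1 (opponent), 0 (empty).
    """
    return np.tanh(weights @ board)

def td0_update(weights, board, next_board, reward, alpha=0.01, gamma=1.0):
    """One TD(0) step: move V(s) toward r + gamma * V(s').

    The gradient of tanh(w.x) with respect to w is (1 - V^2) * x.
    """
    v = evaluate(weights, board)
    v_next = evaluate(weights, next_board)
    delta = reward + gamma * v_next - v
    weights += alpha * delta * (1.0 - v ** 2) * board
    return weights

def parent_child_average(parent, child, beta=0.95):
    """Parent-child weighted averaging for CEL: the new parent is a
    weighted average of the old parent and its child, damping the
    noisy win/loss fitness signal."""
    return beta * parent + (1.0 - beta) * child
```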

Cited by 46 publications (81 citation statements) · References 8 publications
“…Othello), which is a deterministic, perfect information, zero-sum game for two players, has been studied by the AI community [11,12,20,24,25,32]. The game's goal is to control a majority of the pieces at the end of the game by forcing as many of your opponent's pieces to be turned over on an 8 × 8 board as possible.…”
Section: Reversi (mentioning)
confidence: 99%
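As a concrete illustration of the disc-majority goal described in the excerpt above, here is a minimal sketch (an assumption for illustration, not code from the citing paper) of determining the winner of a finished game from a board encoded with +1/-1/0 entries:

```python
def game_result(board):
    """Winner by disc majority at the end of an Othello/Reversi game.

    `board` is an iterable of 64 entries: +1 (black disc), -1 (white disc),
    0 (empty). Returns +1 if black holds the majority, -1 if white does,
    and 0 for a draw.
    """
    margin = sum(board)
    return (margin > 0) - (margin < 0)
```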
“…Noise is added to the evaluation because we would like to collect a variety of game trajectories. The weight w_i of HEUR is determined manually, while that of COEV is optimized by a co-evolutionary computation method [11]. Every policy repeatedly played against every other, and the state transitions were then retrieved from the game trajectories of the winners.…”
Section: Reversi (mentioning)
confidence: 99%
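The protocol quoted above, a weighted heuristic whose score is perturbed by noise so that repeated games between the same policies yield varied trajectories, might look like the following sketch. The feature functions, noise distribution, and noise scale are assumptions rather than details from the citing paper.

```python
import random

def noisy_evaluate(weights, features, board, sigma=0.1):
    """Perturbed linear evaluation: sum_i w_i * f_i(board) + Gaussian noise.

    Hypothetical sketch: the noise term makes otherwise deterministic
    policies produce varied game trajectories when they repeatedly play
    one another, as described in the quoted passage.
    """
    score = sum(w * f(board) for w, f in zip(weights, features))
    return score + random.gauss(0.0, sigma)
```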
“…However, they find at least one setup, using coevolution, wherein evolution outperforms TD. They also present results for Othello [38], finding that TD methods are much faster but that a properly tuned evolutionary method ultimately performs best. Lucas and Togelius [39] present similar comparative results in a simple car-racing domain.…”
Section: Related Work (mentioning)
confidence: 99%
“…Fortunately, the game rules are flexible enough to be easily adapted to smaller boards without loss of the underlying 'spirit' of the game, so in a great part of studies on computer Go the board is downgraded to 9 × 9 or 5 × 5. Following Lucas and Runarsson (2006) as well as Lubberts and Miikkulainen (2001), we consider playing Go on a 5 × 5 board (see Fig. 1).…”
Section: Adopted Computer Go Rules (mentioning)
confidence: 99%