2009 IEEE Symposium on Computational Intelligence and Games
DOI: 10.1109/cig.2009.5286486

Coevolutionary Temporal Difference Learning for Othello

Abstract: This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing coevolutionary search with reinforcement learning that works by interlacing one-population competitive coevolution with temporal difference learning. The coevolutionary part of the algorithm provides for exploration of the solution space, while the temporal difference learning performs its exploitation by local search. We apply CTDL to the board game of Othello, using weighted piece counter for representing playe…
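The weighted piece counter (WPC) representation mentioned in the abstract assigns one weight per board square and scores a position as the weighted sum of the squares' disc values. Below is a minimal sketch of such an evaluator and a greedy 1-ply policy built on it, assuming a board encoding of +1 for the learner's discs, -1 for the opponent's and 0 for empty; the function names and encoding are illustrative, not taken from the paper.

```python
# Minimal WPC sketch: the board is a flat list of 64 values in {+1, -1, 0}
# (+1 own disc, -1 opponent disc, 0 empty); `weights` holds one weight per square.

def wpc_value(board, weights):
    """Score a position as the weighted sum of its squares."""
    return sum(w * x for w, x in zip(weights, board))

def greedy_move(afterstates, weights):
    """1-ply policy: choose the afterstate (board after a legal move)
    with the highest WPC value."""
    return max(afterstates, key=lambda b: wpc_value(b, weights))
```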

Cited by 24 publications (24 citation statements), published 2011 to 2023
References 21 publications

Citation statements:
“…To quote Arthur Lee Samuel's declaration, The temptation to improve the machine's game by giving it standard openings or other man-generated knowledge of playing techniques has been consistently resisted (Samuel, 1959, p. 215). This result confirms our former observations (Szubert et al, 2009), when we demonstrated that hybridizing coevolution with TD(0) proves beneficial when learning the strategy of the game of Othello. Here, we come to similar conclusions for the game of small-board Go, and additionally note that extending the lookahead horizon by using TD(λ) with λ close to 1 can boost the performance of CTDL even further.…”
Section: Results (supporting)
confidence: 80%
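For readers unfamiliar with the TD(λ) variant referred to above, the sketch below shows how accumulating eligibility traces let a single prediction error update the weights of states visited earlier in the game, which is what setting λ close to 1 amounts to. The tanh squashing of the WPC output, the learning rate and all names are assumptions made for illustration; the exact update rule is in the cited papers.

```python
import math

def td_lambda_step(weights, traces, features, target, alpha=0.01, lam=0.9):
    """One TD(lambda) update for a value function V(s) = tanh(w . x(s)).

    `features` is the board vector of the state being updated, `target` the
    TD target (e.g. the value of the next state, or the outcome at game end).
    """
    dot = sum(w * x for w, x in zip(weights, features))
    value = math.tanh(dot)
    error = target - value                       # TD error for this state
    grad = 1.0 - value * value                   # d tanh(dot) / d dot
    for i, x in enumerate(features):
        traces[i] = lam * traces[i] + grad * x   # decayed eligibility of weight i
        weights[i] += alpha * error * traces[i]  # credit reaches earlier states
```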
“…4) speeds up the learning, the difference, initially substantial, becomes rather negligible after several hundreds of thousands of training games. Based on these results, we conclude that CTDL+HoF is moderately sensitive to the TDL-CEL ratio and recommend values greater than 8 for this parameter, which confirms our earlier findings for Othello (Szubert et al, 2009).…”
Section: Determining the Best TDL-CEL Ratio (supporting)
confidence: 78%
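The TDL-CEL ratio discussed above controls how the training budget is split between the two learning components. The sketch below reads the ratio as "TD self-play games per coevolutionary generation"; that reading, the callables passed in, and all names are hypothetical placeholders rather than the authors' API, and the precise definition should be taken from the cited papers.

```python
def ctdl_loop(weights, traces, population, *, generations, tdl_cel_ratio,
              td_selfplay_game, coevolve_generation):
    """Interleave TD learning (exploitation) with coevolution (exploration).

    `td_selfplay_game(weights, traces)` plays one self-play game and updates
    the weights in place; `coevolve_generation(population, weights)` runs one
    generation of one-population competitive coevolution and returns the new
    population. Both are caller-supplied placeholders.
    """
    for _ in range(generations):
        for _ in range(tdl_cel_ratio):           # e.g. a ratio greater than 8
            td_selfplay_game(weights, traces)
        population = coevolve_generation(population, weights)
    return weights, population
```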
“…The heuristics of the tested implementations were the same, incorporating the weighted piece counter and mobility metric as components of the evaluation function [5]. The weight matrix used for piece evaluation was the one obtained with coevolution by Szubert [20]. The same weight matrix was also used in move ordering to select the first move in the PV-nodes on the CPU.…”
Section: Testing Setup (mentioning)
confidence: 99%
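As a sketch of the evaluation described in this citing work, the snippet below combines a WPC term with a mobility term and reuses the WPC weights for move ordering. The mixing coefficient and all helper names are assumptions for illustration only, not the implementation from [5] or [20].

```python
def evaluate(board, wpc_weights, own_mobility, opp_mobility, mobility_coef=10.0):
    """Combine the weighted piece counter with a mobility-difference term."""
    wpc = sum(w * x for w, x in zip(wpc_weights, board))
    return wpc + mobility_coef * (own_mobility - opp_mobility)

def order_moves(moves_with_afterstates, wpc_weights):
    """Sort (move, afterstate) pairs by the WPC value of the afterstate,
    best first, e.g. to pick the move searched first at a PV-node."""
    return sorted(
        moves_with_afterstates,
        key=lambda ma: sum(w * x for w, x in zip(wpc_weights, ma[1])),
        reverse=True)
```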