Keepaway Soccer: From Machine Learning Testbed to Benchmark

Stone, Peter; Kuhlmann, Gregory; Taylor, Matthew E.; Liu, Yaxin

doi:10.1007/11780519_9

Cited by 93 publications (99 citation statements)

References 7 publications

“…Many function approximators have been used, including neural networks, CMACs, and radial basis functions [25]. In this paper we use a radial basis function approximator (RBF), a method with previous empirical successes [13,22].…”

Section: Sarsa (mentioning)

confidence: 99%

“…Keepaway is part of the open source RoboCup Soccer Server [10], and we set parameters the same as in our past research [22,23]. RoboCup simulated soccer is well understood as it has been the basis of multiple international competitions and research challenges.…”

Section: The Benchmark Keepaway Task (mentioning)

confidence: 99%

“…Figure 2 depicts three keepers playing against two takers. All our experiments are run on a code base derived from version 0.6 of the benchmark Keepaway implementation [22].…”

Section: The Benchmark Keepaway Task (mentioning)

confidence: 99%

“…These comparisons are conducted in 3 vs. 2 Keepaway [22], a standard RL benchmark domain based on robot soccer in which agents have noise in both their sensors and actuators. Keepaway is an appealing platform for empirical comparisons because the performance of TD methods in it has already been established in previous studies [8,23].…”

Section: Introduction (mentioning)

confidence: 99%

“…While GAs have been applied to variations of Keepaway [7,26], they have never been applied to the benchmark version of the task. We compare NEAT to Sarsa with radial basis function approximators, the best performing TD method to date [22]. Our results in this domain demonstrate that NEAT discovers better policies, though it requires many more evaluations to do so.…”

Section: Introduction (mentioning)

confidence: 99%

Comparing evolutionary and temporal difference methods in a reinforcement learning domain

Taylor; Whiteson; Stone

2006

Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation

Self Cite

Both genetic algorithms (GAs) and temporal difference (TD) methods have proven effective at solving reinforcement learning (RL) problems. However, since few rigorous empirical comparisons have been conducted, there are no general guidelines describing the methods' relative strengths and weaknesses. This paper presents the results of a detailed empirical comparison between a GA and a TD method in Keepaway, a standard RL benchmark domain based on robot soccer. In particular, we compare the performance of NEAT [19], a GA that evolves neural networks, with Sarsa [16, 17], a popular TD method. The results demonstrate that NEAT can learn better policies in this task, though it requires more evaluations to do so. Additional experiments in two variations of Keepaway demonstrate that Sarsa learns better policies when the task is fully observable and NEAT learns faster when the task is deterministic. Together, these results help isolate the factors critical to the performance of each method and yield insights into their general strengths and weaknesses.
