2009
DOI: 10.1007/s10514-009-9120-4

Reinforcement learning for robot soccer

Abstract: Batch reinforcement learning methods provide a powerful framework for learning efficiently and effectively in autonomous robots. The paper reviews some recent work of the authors aiming at the successful application of reinforcement learning in a challenging and complex domain. It discusses several variants of the general batch learning framework, particularly tailored to the use of multilayer perceptrons to approximate value functions over continuous state spaces. The batch learning framework is successfully …
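The batch framework the abstract describes alternates between collecting transitions, computing bootstrapped targets, and re-fitting the value-function approximator on the whole batch. The following is an illustrative toy sketch of that loop, not the authors' robot-soccer setup: a hypothetical 1-D "reach the goal" task with a linear model standing in for the multilayer perceptron.

```python
import numpy as np

# Toy 1-D task: state x in {0.0, 0.2, ..., 1.0}, actions move by +/-0.2,
# reward 1 on reaching the absorbing goal state x = 1.0.
GAMMA = 0.9
STATES = [round(0.2 * i, 1) for i in range(6)]
ACTIONS = (-0.2, 0.2)

def step(x, a):
    nx = round(min(max(x + a, 0.0), 1.0), 1)
    return nx, (1.0 if nx == 1.0 else 0.0), nx == 1.0

# Fixed batch of transitions: every (state, action) pair once.
batch = [(x, a) + step(x, a) for x in STATES[:-1] for a in ACTIONS]

# Linear Q-function per action: Q_a(x) = w_a[0] + w_a[1] * x.
w = {a: np.zeros(2) for a in ACTIONS}
def q(x, a):
    return w[a] @ np.array([1.0, x])

# Batch ("fitted") Q-iteration: compute targets from the current
# approximator, then re-fit it by least squares on the whole batch.
for _ in range(100):
    feats = {a: [] for a in ACTIONS}
    targets = {a: [] for a in ACTIONS}
    for x, a, nx, r, done in batch:
        t = r if done else r + GAMMA * max(q(nx, b) for b in ACTIONS)
        feats[a].append([1.0, x])
        targets[a].append(t)
    for a in ACTIONS:
        w[a], *_ = np.linalg.lstsq(np.array(feats[a]),
                                   np.array(targets[a]), rcond=None)

greedy = lambda x: max(ACTIONS, key=lambda a: q(x, a))
```

After training, the greedy policy moves toward the goal from every state; the key design point of the batch approach is that each supervised re-fit sees all experience at once, which is what makes powerful but slow-to-train approximators such as MLPs practical.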

Cited by 230 publications (127 citation statements)
References 33 publications
“…The reason behind this problem is the high complexity of the robots' hardware and of the robot-to-robot interactions. Examples of techniques to reduce the dimension of the state space have been used by Riedmiller et al (2009). In this work, the authors applied neural networks as function approximators together with fast learning algorithms (Kalyanakrishnan and Stone 2007).…”
Section: Automatic Design Methods (mentioning)
confidence: 99%
“…Some recent examples of NFQ in real-world applications are learning to swing up and balance a real cart-pole system, time-optimal position control of pneumatic devices, and learning to accurately steer a real car within less than half an hour of driving. The following briefly describes the learning of a neural dribble controller for a RoboCup MidSize League robot (for more details, see also Riedmiller et al (2009)). The autonomous robot (figure 6) uses a camera as its main sensor and is fitted with an omnidirectional drive.…”
Section: NFQ in Control Applications (mentioning)
confidence: 99%
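The NFQ variant mentioned above separates experience collection from training: each outer iteration turns the stored transitions into a supervised training set for the neural network. A minimal sketch of that target-computation step (function and variable names are hypothetical; in Riedmiller's cost-based formulation Q is minimized, hence the min over successor actions):

```python
def nfq_training_set(transitions, q, actions, gamma=0.95):
    """Build one NFQ iteration's supervised set: for each stored
    transition (s, a, cost, s_next, done), the input is (s, a) and the
    target is cost + gamma * min_a' Q(s_next, a'), or just the cost
    when s_next is terminal."""
    inputs, targets = [], []
    for s, a, cost, s_next, done in transitions:
        t = cost if done else cost + gamma * min(q(s_next, b) for b in actions)
        inputs.append((s, a))
        targets.append(t)
    return inputs, targets

# With a stub Q-function that always returns 0.5:
X, y = nfq_training_set(
    [(0.0, 1, 0.1, 0.5, False), (0.5, 1, 0.0, 1.0, True)],
    q=lambda s, a: 0.5, actions=(0, 1), gamma=0.95)
```

The resulting (input, target) pairs are then fed to an ordinary supervised trainer for the MLP (Rprop in the original NFQ work), after which the loop repeats with the updated network.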
“…Abbeel et al [2006, 2007], Atkeson and Schaal [1997], Atkeson [1998], Asada et al [1996], Bakker et al [2003], Benbrahim et al [1992], Benbrahim and Franklin [1997], Birdwell and Livingston [2007], Bitzer et al [2010], Conn and Peters II [2007], Duan et al [2007, 2008], Fagg et al [1998], Gaskett et al [2000], Gräve et al [2010], Hafner and Riedmiller [2007], Huang and Weng [2002], Ilg et al [1999], Katz et al [2008], Kimura et al [2001], Kirchner [1997], Kroemer et al [2009], Latzke et al [2007], Lizotte et al [2007], Mahadevan and Connell [1992], Mataric [1997], Nemec et al [2009, 2010], Oßwald et al [2010], Paletta et al [2007], Platt et al [2006], Riedmiller et al [2009], Rottmann et al [2007], and Kaelbling [1998, 2002] estimate the expected long-term reward, called the value function, and use it to reconstruct the optimal policy. A wide variety of methods exist and can be split mainly into three classes: (i) dynamic programming-based optimal control approaches such as policy iteration or value iteration, (ii) rollout-based Monte Carlo methods and (iii) temporal difference methods such as TD(λ), Q-learning, and SARSA.…”
Section: Model-based (mentioning)
confidence: 99%
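Of the three classes listed in the excerpt, the temporal difference update is the easiest to show concretely. A minimal tabular Q-learning sketch on a hypothetical 4-state chain (illustrative only, not taken from the cited works):

```python
# Tabular Q-learning on a 4-state chain: states 0..3, actions
# left (-1) / right (+1); reaching state 3 ends the episode with reward 1.
GAMMA, ALPHA = 0.9, 0.5
ACTIONS = (-1, 1)
Q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}

def step(s, a):
    ns = min(max(s + a, 0), 3)
    return ns, (1.0 if ns == 3 else 0.0), ns == 3

# Sweep every (state, action) pair repeatedly (exhaustive exploration),
# applying the TD update Q <- Q + alpha * (r + gamma * max_a' Q(s',a') - Q).
for _ in range(200):
    for s in range(3):
        for a in ACTIONS:
            ns, r, done = step(s, a)
            target = r if done else r + GAMMA * max(Q[(ns, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])
```

The table converges to the discounted optimal values (1.0, 0.9, 0.81 for "right" at states 2, 1, 0), from which the optimal policy is read off greedily, which is exactly the "use the value function to reconstruct the optimal policy" step the citation describes.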