2010
DOI: 10.1007/s10994-010-5223-6

Policy search for motor primitives in robotics

Abstract: Many motor skills in humanoid robotics can be learned using parametrized motor primitives. While successful applications to date have been achieved with imitation learning, most of the interesting motor learning problems are high-dimensional reinforcement learning problems. These problems are often beyond the reach of current reinforcement learning methods. In this paper, we study parametrized policy search methods and apply these to benchmark problems of motor primitive learning in robotics. We show that many…
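For readers unfamiliar with what "parametrized motor primitive" means concretely, the sketch below shows an Ijspeert-style dynamic movement primitive for one degree of freedom, the representation this line of work commonly builds on. It is an illustrative reconstruction, not code from the paper; the gains, the basis-function layout, and the function name are assumptions.

```python
import numpy as np

def dmp_rollout(w, y0=0.0, g=1.0, tau=1.0, dt=0.001,
                alpha_z=25.0, beta_z=6.25, alpha_x=8.0):
    """Integrate one degree of freedom of a discrete movement primitive.

    w : (K,) weights of the Gaussian basis functions -- these are the open
        parameters that imitation learning or policy search would adapt.
    """
    K = len(w)
    # Basis-function centers spread along the exponentially decaying phase x.
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, K))
    widths = 1.0 / (np.diff(centers, append=0.0) ** 2 + 1e-8)
    x, y, z = 1.0, y0, 0.0   # phase, position, scaled velocity
    traj = []
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)
        # Learnable forcing term; it vanishes as the phase x decays to zero,
        # so the movement is guaranteed to converge to the goal g.
        f = (psi @ w) / (psi.sum() + 1e-10) * x * (g - y0)
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)
        traj.append(y)
    return np.array(traj)
```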

Cited by 294 publications (284 citation statements) · References 31 publications
“…There exist several algorithms which use probabilistic inference techniques for computing the policy update in reinforcement learning (Dayan and Hinton 1993; Theodorou et al. 2010; Kober and Peters 2010; Peters et al. 2010). More formally, they either re-weight state-action trajectories or state-action pairs according to the estimated quality of the state-action pair and, subsequently, use a weighted maximum likelihood estimate to obtain the parameters of a new policy π*.…”
Section: Probabilistic Reinforcement Learning Algorithms (mentioning; confidence: 99%)
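The re-weighting-plus-weighted-ML scheme this excerpt describes can be sketched in a few lines. The snippet below is an illustrative reconstruction, not code from any of the cited papers; the Gaussian search distribution, the exponential re-weighting, and the names (reward_weighted_update, eta) are assumptions.

```python
import numpy as np

def reward_weighted_update(thetas, returns, eta=1.0):
    """One EM-like policy update: re-weight sampled policy parameters by
    their episodic return, then fit a new Gaussian by weighted max. likelihood.

    thetas  : (N, d) policy parameters sampled from the old policy
    returns : (N,) episodic returns, one per sampled rollout
    eta     : temperature; smaller values re-weight more greedily
    """
    # Exponential re-weighting of samples by estimated quality; subtracting
    # the best return keeps the exponentials numerically stable.
    w = np.exp((returns - returns.max()) / eta)
    w /= w.sum()
    # Weighted maximum-likelihood estimate of the new Gaussian policy.
    mean = w @ thetas
    centered = thetas - mean
    cov = (w[:, None] * centered).T @ centered
    return mean, cov

# Usage sketch: sample parameters, roll them out, update, repeat.
rng = np.random.default_rng(0)
thetas = rng.normal(size=(50, 10))      # 50 sampled parameter vectors
returns = -np.sum(thetas**2, axis=1)    # stand-in for episodic returns
mean, cov = reward_weighted_update(thetas, returns)
```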
“…The parameter η is a temperature parameter that is either optimized by the algorithm (Daniel et al. 2012) or manually set (Theodorou et al. 2010; Kober and Peters 2010). A new parametrized policy π* can then be obtained by minimizing the expected Kullback-Leibler divergence between the re-weighted policy update p(a|s) and the new parametric policy π* (van Hoof et al. 2015), i.e.,…”
Section: Probabilistic Reinforcement Learning Algorithms (mentioning; confidence: 99%)
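The quotation breaks off at "i.e.,"; for context, a plausible reconstruction of the update it is describing is sketched below. The symbols q for the old policy and Q for the quality estimate are assumptions; only the temperature η and the KL form are taken from the excerpt.

```latex
% Soft-max re-weighting of actions by their estimated quality, with
% temperature \eta (assumed functional form; the excerpt is truncated):
p(a \mid s) \;\propto\; q(a \mid s)\,\exp\!\bigl(Q(s,a)/\eta\bigr)

% The new parametric policy minimizes the expected KL divergence to p,
% which (up to per-state normalization) is a weighted ML problem:
\pi^{*} \;=\; \arg\min_{\pi}\; \mathbb{E}_{s}\Bigl[\mathrm{KL}\bigl(p(\cdot \mid s)\,\big\|\,\pi(\cdot \mid s)\bigr)\Bigr]
        \;=\; \arg\max_{\pi}\; \mathbb{E}_{s,\,a \sim q}\Bigl[\exp\!\bigl(Q(s,a)/\eta\bigr)\,\log \pi(a \mid s)\Bigr]
```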
“…In most applications to date, only a single motion primitive is used for the whole movement. Parametrized policy search methods such as policy gradient descent and EM-like policy updates (Kober & Peters, 2009) have been used in order to improve single-stroke motor primitives.…”
Section: Introduction (mentioning; confidence: 99%)