2009
DOI: 10.1007/s10514-009-9133-z

Nonparametric representation of an approximated Poincaré map for learning biped locomotion

Abstract: We propose approximating a Poincaré map of biped walking dynamics using Gaussian processes. We locally optimize parameters of a given biped walking controller based on the approximated Poincaré map. By using Gaussian processes, we can estimate a probability distribution of a target nonlinear function with a given covariance. Thus, an optimization method can take the uncertainty of approximated maps into account throughout the learning process. We use a reinforcement learning (RL) method as the optimization met…
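As a rough illustration of the idea in the abstract, the sketch below fits a Gaussian process to samples of a one-dimensional section-to-section return map and reads out the predictive mean and standard deviation that an optimizer could weight. This is a minimal sketch using scikit-learn, not the paper's implementation; the one-dimensional section state, the synthetic return-map data, and the kernel choice are all assumptions made for this example.

# Minimal sketch (not the authors' code): fit a Gaussian process to samples of a
# Poincare-section return map and read out its predictive uncertainty.
# The data below are synthetic stand-ins for states observed at section crossings
# under a fixed controller parameter setting.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# x_k: state at the k-th crossing of the Poincare section; y_k approximates x_{k+1}.
x_k = rng.uniform(-1.0, 1.0, size=(40, 1))
y_k = 0.8 * x_k[:, 0] + 0.1 * np.sin(3.0 * x_k[:, 0]) + 0.02 * rng.standard_normal(40)

# GP prior with an RBF covariance plus observation noise; hyperparameters are
# tuned by marginal-likelihood maximization inside fit().
kernel = RBF(length_scale=0.5) + WhiteKernel(noise_level=1e-3)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(x_k, y_k)

# Predictive mean and standard deviation of the approximated return map.
# An optimizer (e.g., an RL update over controller parameters) can weight
# candidate updates by this uncertainty instead of trusting the mean map everywhere.
x_query = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
mean, std = gp.predict(x_query, return_std=True)
print(mean[:3], std[:3])

In the paper the optimization over controller parameters is an RL method; the sketch shows only the GP approximation step in isolation.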

Cited by 30 publications (10 citation statements); references 35 publications (33 reference statements). Citing publications span 2010–2023.

Citation statements (ordered by relevance):
“…In this field, learning techniques are mostly applied to improve walking policies that were derived from simulations [7] or observed from human walking [8]. Furthermore, due to the complexity of biped locomotion, every walking robot is equipped with many sensors to determine its stance and to allow closed-loop control.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, in more dynamic control problems, even without considering reward functions at the moment, the robot has to consider a high-dimensional state space and estimate a hidden variable online. In such situations, the complex external dynamics need to be reduced using sophisticated methods (Stephens, 2007; Morimoto & Atkeson, 2009; Kajita, Kanehiro, Kaneko, Fujiwara, & Yokoi, 2003). To apply these techniques developed in robotics to the RL dynamics, it is convenient for RL robots to have a modular architecture in which dynamics and reward can be addressed independently.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, due to the nonlinear dynamic property of the coupled oscillator system composed of the CPG controller and the robot, it is rather difficult to analytically design the biped trajectory to satisfy the requirements of a target walking pattern. Therefore, using a nonlinear optimization method to improve the walking trajectory is reasonable [6], [7]. To optimize the walking trajectory, a model-free optimal control method is preferable because precisely modeling the ground contact is difficult.…”
Section: Introduction (mentioning)
confidence: 99%