2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids) 2017
DOI: 10.1109/humanoids.2017.8246900
Emergence of human-comparable balancing behaviours by deep reinforcement learning

Abstract: This paper presents a hierarchical framework based on deep reinforcement learning that learns a diversity of policies for humanoid balance control. Conventional zero-moment-point based controllers perform limited actions during under-actuation, whereas the proposed framework can perform human-like balancing behaviours such as active push-off of the ankles. Learning is guided by an explainable reward designed from physical constraints. Simulated results are presented and analysed. The su…

Cited by 16 publications (12 citation statements) · References 21 publications
“…The total reward is composed of a weighted sum of scalar components Σᵢ ωᵢrᵢ, where rᵢ is a reward term and ωᵢ its weight. In order to provide a similar scale for each of them, and thereby improve the interpretability of the total reward, we process the real and vector components with a Radial Basis Function (RBF) kernel [41] whose dimension is given by a cutoff parameter calculated from the desired sensitivity. Appendix A provides a more detailed description of the kernel.…”
Section: Reward
confidence: 99%
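The statement above describes normalising each reward term with an RBF kernel so that all terms share a similar scale before being summed. A minimal sketch of that idea follows; the function names, the exact parameterisation of the kernel width from the cutoff, and the default sensitivity are illustrative assumptions, not the cited paper's implementation.

```python
import numpy as np

def rbf_reward(value, target, cutoff, sensitivity=0.1):
    """Map an error to (0, 1] with an RBF kernel.

    The kernel width is chosen (an assumption) so that the reward falls
    to `sensitivity` when the error norm equals `cutoff`.
    """
    # Pick alpha such that exp(-alpha * cutoff**2) == sensitivity.
    alpha = -np.log(sensitivity) / cutoff**2
    err = np.linalg.norm(np.atleast_1d(value) - np.atleast_1d(target))
    return float(np.exp(-alpha * err**2))

def total_reward(terms, weights):
    """Weighted sum sum_i w_i * r_i of RBF-normalised reward terms."""
    return sum(w * r for w, r in zip(weights, terms))
```

Because every term lives in (0, 1], the weights alone determine each term's contribution, which is what makes the composed reward easy to interpret.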
“…The learning tasks JPL and WSC were inspired by similar works [Yang et al. 2017, Yang et al. 2018] where the simulated robot was periodically pushed while trying to keep upright. In this work we faced a new challenge: trying to learn a push-recovery policy while the robot runs.…”
Section: Running Agent With No Priors - RANP
confidence: 99%
“…The design of the reward function is a crucial part of reinforcement learning, as the reward governs the resulting behavior. The reward design follows a similar design rule as in [29]. Balancing can be divided into four subtasks: regulating upper body pose, regulating CoM position, regulating CoM velocity, and regulating ground contact force.…”
Section: Design Of Reward Function
confidence: 99%
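The four balance subtasks named above can be sketched as weighted, squashed error terms. The weights, error scales, and dictionary keys below are hypothetical placeholders chosen for illustration; the actual values used in the cited work are not reproduced here.

```python
import numpy as np

# Hypothetical weights for the four balance subtasks (assumed values).
WEIGHTS = {"torso_pose": 0.3, "com_pos": 0.3, "com_vel": 0.2, "contact_force": 0.2}

def squash(error, scale):
    """Map an error norm into (0, 1] (an assumed RBF-style shaping)."""
    return float(np.exp(-(np.linalg.norm(error) / scale) ** 2))

def balance_reward(state):
    """Weighted sum over the four subtask terms.

    `state` holds error vectors w.r.t. the reference: upright torso
    orientation, desired CoM position and velocity, and the desired
    ground-contact force distribution.
    """
    terms = {
        "torso_pose": squash(state["torso_pose_err"], 0.2),      # rad
        "com_pos": squash(state["com_pos_err"], 0.05),           # m
        "com_vel": squash(state["com_vel_err"], 0.5),            # m/s
        "contact_force": squash(state["contact_force_err"], 100.0),  # N
    }
    return sum(WEIGHTS[k] * terms[k] for k in WEIGHTS)
```

With zero error in every subtask the reward reaches its maximum of 1.0 (the weights sum to one), and each term decays smoothly as its error grows, so no single subtask dominates the gradient signal.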