2020
DOI: 10.48550/arxiv.2011.01387
Preprint

Sim-to-Real Learning of All Common Bipedal Gaits via Periodic Reward Composition

Abstract: We study the problem of realizing the full spectrum of bipedal locomotion on a real robot with sim-to-real reinforcement learning (RL). A key challenge of learning legged locomotion is describing different gaits, via reward functions, in a way that is intuitive for the designer and specific enough to reliably learn the gait across different initial random seeds or hyperparameters. A common approach is to use reference motions (e.g. trajectories of joint positions) to guide learning. However, finding high-quali…

Cited by 5 publications (8 citation statements)
References 20 publications
“…Integrating task-specific signals into the observation space is another solution for learning multiple gaits. A recent work from Siekmann et al [28] have shown bipedal controller for multiple gaits by using a cycle time with offsets and a vector ratio as gait representation and periodic reward for training. Compared to naive training approaches, the design of the contact-related periodic reward can capture the major characters of gaits.…”
Section: Multi-task Learning In Legged Locomotion | mentioning
confidence: 99%
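
The statement above describes the gait representation only in outline: a cycle time, per-foot phase offsets, and a ratio. As a rough illustration of how such a periodic, contact-related reward could be composed, here is a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the cited paper: the function names, the hard 0/1 swing indicator, reading the ratio as the fraction of the cycle spent in swing, and the choice to penalize foot force during swing and foot speed during stance.

```python
def swing_indicator(phase, swing_ratio):
    """Return 1.0 while a foot is expected to be in swing, else 0.0 (stance).

    phase: position in the gait cycle, any real number (wrapped into [0, 1)).
    swing_ratio: assumed fraction of the cycle spent in swing.
    """
    return 1.0 if (phase % 1.0) < swing_ratio else 0.0


def periodic_reward(t, period, offsets, swing_ratio, foot_forces, foot_speeds):
    """Hypothetical periodic reward summed over all feet.

    During the expected swing phase a foot is penalized for contact force;
    during the expected stance phase it is penalized for moving.
    """
    reward = 0.0
    for offset, force, speed in zip(offsets, foot_forces, foot_speeds):
        phase = (t / period + offset) % 1.0
        swing = swing_indicator(phase, swing_ratio)
        reward -= swing * force + (1.0 - swing) * speed
    return reward


# Offsets of (0.0, 0.5) put the two feet half a cycle apart (a walking-like
# pattern); offsets of (0.0, 0.0) would synchronize them (a hopping-like one).
walking = periodic_reward(
    t=0.1, period=1.0, offsets=(0.0, 0.5), swing_ratio=0.4,
    foot_forces=(2.0, 5.0), foot_speeds=(1.0, 3.0),
)
```

In this sketch, changing only the offsets and the ratio changes which gait the reward favors, which is the property the citing authors point to when they say the periodic reward captures the major characteristics of a gait.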
“…Siekmann et el. proposed a simple reward signal to learn several bipedal gaits considering the periodicity of gait control signal [21]. However, previous approaches are quite limited in robustness and scalability because it is based on manual data generation process and limited heuristics.…”
Section: B Gait Generation | mentioning
confidence: 99%
“…Some researchers have taken advantage of motion priors in order to encode gait-specific knowledge in a reward function. One popular method for encoding such knowledge in a reward function is to maximize the similarity between the robot's motion and a reference trajectory (Peng et al 2020;Smith et al 2021). While this approach has been successfully demonstrated on real robots, it requires significant manual effort to obtain reference trajectories, and constrains the robot's motion to the given trajectory.…”
Section: Introduction | mentioning
confidence: 99%
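
For contrast, the reference-trajectory approach mentioned in the last statement can be sketched as a reward that grows as the robot's joint positions approach those of a reference motion. The snippet below is a generic illustration only: the exponential-of-squared-error form, the `scale` parameter, and the function name are assumptions, not the formulation used by the works cited in the statement.

```python
import math


def imitation_reward(joint_positions, reference_positions, scale=2.0):
    """Hypothetical similarity reward in (0, 1]; 1.0 means a perfect match.

    joint_positions / reference_positions: equal-length sequences of joint
    angles at the current time step. scale is an assumed sensitivity factor.
    """
    if len(joint_positions) != len(reference_positions):
        raise ValueError("joint and reference vectors must have equal length")
    squared_error = sum(
        (q - q_ref) ** 2
        for q, q_ref in zip(joint_positions, reference_positions)
    )
    return math.exp(-scale * squared_error)
```

A reward of this shape needs a reference value for every joint at every time step, which reflects the manual effort and the constraint on the robot's motion that the statement describes.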