Robotics: Science and Systems XVIII 2022
DOI: 10.15607/rss.2022.xviii.022
Rapid Locomotion via Reinforcement Learning

Abstract: Agile maneuvers such as sprinting and high-speed turning in the wild are challenging for legged robots. We present an end-to-end learned controller that achieves record agility for the MIT Mini Cheetah, sustaining speeds up to 3.9 m/s. This system runs and turns fast on natural terrains like grass, ice, and gravel and responds robustly to disturbances. Our controller is a neural network trained in simulation via reinforcement learning and transferred to the real world. The two key components are (i) an adaptiv…

Cited by 75 publications (32 citation statements) · References 1 publication
“…2) Model-free RL for legged locomotion control: Recent years have seen exciting progress on using deep RL to learn locomotion controllers for quadrupedal robots [38]-[41] and bipedal robots [42]-[47] in the real world. Since it is challenging in general to learn a single policy with RL to perform various tasks [48], many prior works focus on learning a single-task policy [16], [17], [49], [50] for legged robots, such as just forward walking [39], [51], [52]. There have been efforts to obtain a multi-task policy, such as walking at different velocities using different gaits, conditioned only on variable commands [46], [53]-[55], which requires more extensive tuning due to the lack of a gait prior.…”
Section: Related Work
confidence: 99%
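For concreteness, the command-conditioned multi-task policy described in this passage can be as simple as an MLP that consumes proprioceptive observations concatenated with a commanded body velocity. The sketch below is a minimal illustration, not the architecture of any cited work; the dimensions and names (PROPRIO_DIM, CMD_DIM, ACTION_DIM) are assumptions.

    import torch
    import torch.nn as nn

    PROPRIO_DIM = 48   # assumed: joint positions/velocities, orientation, last action
    CMD_DIM = 3        # assumed: commanded (v_x, v_y, yaw rate)
    ACTION_DIM = 12    # assumed: target joint positions for a quadruped

    class CommandConditionedPolicy(nn.Module):
        """MLP policy conditioned on a velocity command.

        One network covers many 'tasks' (walk fast or slow, turn) because
        the task is encoded in the command input rather than in the weights.
        """
        def __init__(self, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(PROPRIO_DIM + CMD_DIM, hidden), nn.ELU(),
                nn.Linear(hidden, hidden), nn.ELU(),
                nn.Linear(hidden, ACTION_DIM),
            )

        def forward(self, proprio, command):
            # Conditioning is plain concatenation of observation and command.
            return self.net(torch.cat([proprio, command], dim=-1))

    policy = CommandConditionedPolicy()
    action = policy(torch.zeros(1, PROPRIO_DIM), torch.tensor([[2.0, 0.0, 0.5]]))

Because the command is just another input, changing the target velocity at deployment time requires no retraining, which is what makes this a multi-task policy in the sense used above.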
“…Since performing rollouts on the hardware of human-scale bipedal robots is expensive, we use the zero-shot transfer method. To realize this, there are two widely adopted techniques: (i) end-to-end training of a policy by providing the robot with a proprioceptive short-term history [39], [45], [57] or long-term history [44], [62], [68]; (ii) teacher-student training, which first obtains a teacher policy with privileged information about the environment via RL, then uses this policy to supervise the training of a student policy that only has access to onboard-available observations [18], [38], [40], [42], [52], [55], and which has shown advantages over end-to-end training [38], [52], [70]. However, here we show that, for the dynamic control of bipedal robots, training the robot end-to-end with a newly proposed policy structure can achieve better learning performance than the teacher-student method, which separates the training process and requires more data.…”
Section: Related Work
confidence: 99%
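As a rough illustration of the second route contrasted above, the sketch below shows the supervision step of teacher-student training: a frozen teacher that sees privileged simulator state produces target actions, and a student that sees only a proprioceptive history is regressed onto them. All names and dimensions are assumptions, not any cited system.

    import torch
    import torch.nn as nn

    PRIV_DIM, OBS_DIM, HIST_LEN, ACT_DIM = 24, 48, 50, 12  # assumed sizes

    teacher = nn.Sequential(  # sees privileged state (e.g. friction, true velocity)
        nn.Linear(OBS_DIM + PRIV_DIM, 256), nn.ELU(), nn.Linear(256, ACT_DIM))
    student = nn.Sequential(  # sees only a history of onboard observations
        nn.Linear(OBS_DIM * HIST_LEN, 256), nn.ELU(), nn.Linear(256, ACT_DIM))

    opt = torch.optim.Adam(student.parameters(), lr=3e-4)

    def distill_step(obs, priv, obs_history):
        """One behavior-cloning step: regress student actions onto teacher actions."""
        with torch.no_grad():  # teacher was trained earlier with RL and is frozen
            target = teacher(torch.cat([obs, priv], dim=-1))
        pred = student(obs_history.flatten(1))
        loss = ((pred - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

    # e.g. distill_step(torch.randn(64, OBS_DIM), torch.randn(64, PRIV_DIM),
    #                   torch.randn(64, HIST_LEN, OBS_DIM))

The end-to-end alternative skips this second stage entirely: the history-conditioned policy is trained directly with RL, which is what the quoted passage argues can work better for dynamic bipedal control.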
“…Empirical results show that as discrepancies between the training and deployment environments become more pronounced, invariance through latent alignment has a large competitive edge over alternatives such as data augmentation techniques. The problem of test-time adaptation in visual reinforcement learning using unsupervised test-time trajectories is relatively new, but has thus far shown great relevance and promise in robotics [11,13], where a sim-to-real pipeline has been at the forefront of recent progress [20,28,41].…”
Section: Closing Remarks
confidence: 99%
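The passage refers to "invariance through latent alignment" only at a high level; one simple way to make the idea concrete is moment matching: record the mean and variance of the encoder's latent features on training data, then fine-tune the encoder at test time so that latents from unlabeled deployment trajectories match those statistics. The sketch below is such a minimal instantiation under assumed names and dimensions, not the method of the cited paper.

    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))

    # Latent statistics saved from the training environment (assumed given).
    train_mu, train_var = torch.zeros(32), torch.ones(32)

    opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)

    def adapt_step(test_obs):
        """Unsupervised test-time step: align latent moments with training stats."""
        z = encoder(test_obs)  # latents from unlabeled deployment observations
        loss = ((z.mean(0) - train_mu) ** 2).mean() + \
               ((z.var(0) - train_var) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

No rewards or labels are needed at deployment, which is what makes this a test-time (rather than training-time) adaptation scheme.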
“…Reinforcement learning for control has achieved great success in a wide variety of challenging sensory-motor control tasks, including agile drone flight [20,21,26], deformable object manipulation [41], and quadruped locomotion [19,24,28,30]. Compared with their classical model-predictive control counterparts, reinforcement-learning-based approaches enable the use of a more realistic forward dynamics model in the form of a physics simulator.…”
Section: Introduction
confidence: 99%
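The practical point in that last sentence is that an RL trainer touches the dynamics only through reset() and step(), so an arbitrarily detailed physics simulator can stand in for the analytic model an MPC controller would require. A schematic rollout-collection loop under a hypothetical gym-style environment interface:

    def collect_rollout(env, policy, horizon=1000):
        """Collect one on-policy trajectory; the simulator is a black box here.

        `env` can be any physics simulator exposing reset()/step(); no
        differentiable or analytic dynamics model is required, unlike
        model-predictive control.
        """
        obs = env.reset()
        trajectory = []
        for _ in range(horizon):
            action = policy(obs)  # policy network forward pass
            next_obs, reward, done, info = env.step(action)
            trajectory.append((obs, action, reward, next_obs, done))
            obs = env.reset() if done else next_obs
        return trajectory  # fed to a policy-gradient update (e.g. PPO)

This black-box access is why contact-rich dynamics (legs striking ground, slipping on ice) can be modeled at full simulator fidelity during training rather than simplified into a tractable analytic form.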