2021
DOI: 10.48550/arxiv.2105.08328
Preprint
Blind Bipedal Stair Traversal via Sim-to-Real Reinforcement Learning

Abstract: Accurate and precise terrain estimation is a difficult problem for robot locomotion in real-world environments. Thus, it is useful to have systems that do not depend on accurate estimation to the point of fragility. In this paper, we explore the limits of such an approach by investigating the problem of traversing stair-like terrain without any external perception or terrain models on a bipedal robot. For such blind bipedal platforms, the problem appears difficult (even for humans) due to the surprise elevatio…

Cited by 15 publications (26 citation statements)
References 7 publications
“…2) uses the same graph morphologies as Multi-Ped, but instead of learning locomotion strategies on flat terrain, the agent has to navigate a stair-like pattern of blocks. The state space does not contain any information about the current positions or heights of blocks, making the agent navigate the terrain blindly (similarly to [46]). The rewards and design parameters are the same as the Multi-Ped environment.…”
Section: Methods
confidence: 99%
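The "blind" setup described in the statement above amounts to an observation that deliberately omits terrain geometry, so the policy must react to contacts rather than anticipate block heights. A minimal sketch, assuming a proprioception-only observation vector; the function name and array sizes are hypothetical, not taken from either paper:

```python
import numpy as np

def blind_observation(joint_pos, joint_vel, base_orientation, last_action):
    """Build a proprioceptive-only observation. Note the deliberate absence
    of any terrain information (block positions or heights): the policy sees
    only the robot's own state and its previous action."""
    return np.concatenate([joint_pos, joint_vel, base_orientation, last_action])
```

The design point is that robustness comes from the training distribution (randomized terrain), not from sensing the terrain at deployment time.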
“…[10,11,12]), the majority of works with real robots have focused on approaches that learn locomotion skills in simulation, and then transfer the resulting controllers to the hardware [13,14,15,16]. Examples include the traversal of rough terrain with a quadruped [17,18] and robust walking and stair climbing with a biped [19,20,21]. With a sufficiently accurate simulation model (for instance via learned actuator models [13]) and policies that are robust to or can adapt to distribution shift (e.g.…”
Section: Introduction
confidence: 99%
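Sim-to-real transfer of the kind this statement describes commonly pairs an accurate simulator with per-episode randomization of its dynamics, so the policy tolerates model mismatch on hardware. A sketch under stated assumptions: the parameter names and noise ranges below are illustrative, not values from the cited works:

```python
import random

def randomize_dynamics(base_params, rng=random):
    """Return a perturbed copy of nominal simulator parameters for one
    training episode, using uniform multiplicative noise. Parameters not
    listed in `ranges` are passed through unchanged."""
    ranges = {"mass": 0.2, "friction": 0.4, "motor_strength": 0.1}
    return {
        name: value * (1.0 + rng.uniform(-ranges.get(name, 0.0),
                                         ranges.get(name, 0.0)))
        for name, value in base_params.items()
    }
```

Resampling at every episode forces the policy to work across the whole parameter distribution rather than overfitting to one simulator configuration.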
“…While recent hardware and algorithmic developments have led to impressive legged locomotion demonstrations of learned controllers [1], [2], model-based planning approaches [3], [4], and combinations of the two [5], [6], there appears to be a trade-off between the overall agility of the systems and the ability to traverse complex terrains that require precise foot placements. Broadly speaking, methods that involve planning tend to struggle more with dynamic movements, while learned controllers struggle more with producing precise, coordinated foot placements and desirable gaits.…”
Section: Introduction
confidence: 99%
“…Reinforcement learning (RL) methods generate closed-loop control policies which can produce very dynamic and robust gaits that have been successfully transferred to hardware [2], [7]. However, RL often requires significant tuning and the use of many shaping rewards to produce desirable gaits.…”
Section: Introduction
confidence: 99%
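The "many shaping rewards" this statement refers to are typically combined as a weighted sum of task and regularization terms. A minimal sketch; the term names and weights are hypothetical, not drawn from the cited papers:

```python
def shaped_reward(forward_vel, target_vel, torques, base_roll_pitch,
                  w_vel=1.0, w_torque=1e-4, w_orient=0.5):
    """Weighted sum of common shaping terms: track a commanded forward
    velocity, penalize actuator effort, and penalize deviation from an
    upright base posture. Each weight must be tuned against the others."""
    vel_term = -w_vel * (forward_vel - target_vel) ** 2
    torque_term = -w_torque * sum(t * t for t in torques)
    orient_term = -w_orient * sum(a * a for a in base_roll_pitch)
    return vel_term + torque_term + orient_term
```

The tuning burden the statement mentions arises because the relative weights trade off against each other: too large a torque penalty yields timid gaits, too small an orientation penalty yields falls.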