Learning to Walk via Deep Reinforcement Learning
Preprint, 2018
DOI: 10.48550/arxiv.1812.11103

Cited by 91 publications (116 citation statements); References 0 publications.
“…Every module is fed all environment information x_k = (x, k) and distinctly chosen IGFs mask which part of the input each network has access to, thereby influencing which skills they learn. By presenting multiple priors, we enable a comparison with existing literature [14,19,20,21]. With the right masking, one can recover previously investigated asymmetries [14,19], explore additional ones, and also express purely hierarchical [9] and KL-regularized equivalents [8].…”
Section: Information Asymmetry (mentioning)
confidence: 99%
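The masking scheme described in this excerpt can be pictured with a minimal sketch: every module receives the full input x_k = (x, k), and a per-module binary mask decides which entries that module actually conditions on. The function name, module roles, and mask choices below are hypothetical illustrations and are not taken from the cited work.

```python
import numpy as np

def make_masked_input(x, k, mask):
    """Concatenate observation x and task index k, then zero out masked entries."""
    full = np.concatenate([x, np.atleast_1d(k).astype(float)])
    return full * mask

obs_dim, task_dim = 4, 1
x = np.random.randn(obs_dim)  # environment observation (placeholder)
k = 2.0                       # task / goal index (placeholder)

# Hypothetical asymmetry: a "prior" module sees only the first two observation
# dimensions and no task index, while the "policy" module sees everything.
prior_mask = np.array([1, 1, 0, 0, 0], dtype=float)
policy_mask = np.ones(obs_dim + task_dim)

prior_input = make_masked_input(x, k, prior_mask)    # restricted view
policy_input = make_masked_input(x, k, policy_mask)  # full view
```

Choosing which entries each module's mask exposes is the "information asymmetry" the excerpt refers to: a module denied the task index k is pushed toward task-agnostic behaviour, while the fully informed policy can specialize.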
“…Learning for Legged Locomotion and Gaits: Data-driven learning for legged locomotion has shown robust controllers for quadrupedal robots [22,34,35,36,37]. Of these, [34,22] focus on a controller on complex terrains but do not synthesize and analyze leg patterns of the robot across diverse gaits at different target speeds.…”
Section: Related Work (mentioning)
confidence: 99%
“…We also validate the policy with successful transfer to hardware. Previously, [22] trained policies in Minitaur quadruped in 160 epochs using SAC. Due to increasing complexity, we will consider direct hardware training for quadrupeds in future.…”
Section: Hardware Training and Transfer (mentioning)
confidence: 99%
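For context, the preprint this excerpt cites trains the Minitaur quadruped with soft actor-critic (SAC). The sketch below shows what SAC training on a simulated Minitaur can look like, using Stable-Baselines3 and the PyBullet Minitaur environment as stand-ins; the library choice, environment name, and hyperparameters are assumptions for illustration and do not reproduce the cited paper's implementation, training schedule, or hardware setup.

```python
# Minimal SAC training sketch (illustrative only, not the cited implementation).
# Requires a gym/Stable-Baselines3 version combination where pybullet_envs'
# registration works (e.g., SB3 1.x with classic gym).
import gym
import pybullet_envs  # registers MinitaurBulletEnv-v0 with gym
from stable_baselines3 import SAC

env = gym.make("MinitaurBulletEnv-v0")

model = SAC(
    "MlpPolicy",
    env,
    learning_rate=3e-4,      # placeholder hyperparameters
    buffer_size=1_000_000,
    batch_size=256,
    verbose=1,
)

# Train for a fixed number of simulated environment steps; the "160 epochs"
# in the quote refers to the original paper's own schedule, not this sketch.
model.learn(total_timesteps=200_000)
model.save("sac_minitaur")
```

SAC's off-policy replay buffer is what makes this kind of sample-efficient training plausible, which is also why the quoted work considers direct hardware training a realistic future step.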