2019
DOI: 10.48550/arxiv.1906.00410
Preprint
Learning Domain Randomization Distributions for Training Robust Locomotion Policies

Abstract: Domain randomization (DR) is a successful technique for learning robust policies for robot systems when the dynamics of the target robot system are unknown. The success of policies trained with domain randomization, however, is highly dependent on the correct selection of the randomization distribution. The majority of success stories typically use real-world data to carefully select the DR distribution, or incorporate real-world trajectories to better estimate appropriate randomization distributions.…

Cited by 3 publications (3 citation statements) | References 9 publications
“…Parameter inference may also be integrated in a closed-loop system (Chebotar et al., 2019; Mehta, Diaz, Golemo, Pal, & Paull, 2019; Mozifian, Higuera, Meger, & Dudek, 2019; Ramos et al., 2019), where the currently estimated posterior over simulation parameters guides domain randomization and policy learning. Such approaches iteratively reduce the sim2real gap.…”
Section: Parameter Inference For Simulators
confidence: 99%
“…Parameter inference may also be integrated in a closed-loop system [11,54,62,73], where the currently estimated posterior over simulation parameters guides domain randomization and policy learning. Such approaches iteratively reduce the sim2real gap.…”
Section: Parameter Inference For Simulators
confidence: 99%
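The closed-loop scheme these citing works describe, in which a posterior over simulation parameters guides domain randomization between training rounds, can be sketched roughly as follows. This is a minimal illustrative sketch only: the function names, the single scalar dynamics parameter, and the Gaussian averaging update are assumptions of this example, not the method of any cited paper.

```python
import random
import statistics

def train_policy(param_dist):
    # Hypothetical stand-in for an RL training step: a real system would
    # train a policy on dynamics sampled from param_dist.
    return param_dist

def real_rollout(true_param, noise=0.05):
    # Stand-in for a real-world trajectory: a noisy observation of the
    # unknown true dynamics parameter.
    return random.gauss(true_param, noise)

def closed_loop_dr(true_param=1.4, iterations=20, rollouts=16, seed=0):
    """Iteratively narrow a Gaussian randomization distribution (mean, std)
    toward the unknown true parameter, so that each training round sees
    dynamics closer to the real system's."""
    random.seed(seed)
    mean, std = 1.0, 0.5  # broad initial DR distribution
    for _ in range(iterations):
        train_policy((mean, std))
        observations = [real_rollout(true_param) for _ in range(rollouts)]
        # Crude posterior-style update: shift the distribution toward the data.
        mean = 0.5 * mean + 0.5 * statistics.mean(observations)
        std = max(0.5 * std + 0.5 * statistics.stdev(observations), 1e-3)
    return mean, std

mean, std = closed_loop_dr()
```

Run on this toy problem, the loop contracts the initially broad distribution around the true parameter, which is the sense in which such approaches "iteratively reduce the sim2real gap."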
“…Leveraging Bayesian Optimization, Muratore et al. [131] update the randomized parameter distribution to maximize policy performance in the target domain. Mozifian et al. [132] take a different perspective, updating the parameter distribution to balance conservativeness and robustness. However, the real-world applicability of this method is uncertain, as it does not explicitly consider real-world performance.…”
Section: Learning Robust Policy Through Domain Randomization
confidence: 99%
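The outer-loop search described above, choosing the randomization distribution whose trained policy performs best, can be illustrated with a toy surrogate objective. Everything here is an assumption of the example: the reward model, the candidate widths, and the grid search standing in for Bayesian optimization; it only shows why an intermediate randomization width can beat both extremes (too narrow misses the sim-to-real shift, too wide is overly conservative).

```python
import random

def target_domain_return(dr_std, true_shift=0.3, trials=2000, seed=0):
    # Toy surrogate for "train under N(0, dr_std), evaluate in the target
    # domain": wider randomization is more likely to cover the true
    # sim-to-real shift, but pays a conservativeness penalty.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sampled = rng.gauss(0.0, dr_std)  # dynamics seen during training
        covered = abs(sampled - true_shift) < dr_std + 0.05
        total += (1.0 if covered else 0.0) - 0.4 * dr_std
    return total / trials

# Grid search standing in for the Bayesian-optimization outer loop:
# evaluate candidate randomization widths and keep the best-performing one.
candidates = [0.05, 0.1, 0.2, 0.3, 0.5, 0.8]
best_std = max(candidates, key=target_domain_return)
```

On this surrogate, an intermediate width wins: the narrowest candidates rarely cover the shifted target dynamics, while the widest one is penalized for conservativeness, mirroring the robustness/conservativeness trade-off the citing survey attributes to these methods.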