2021
DOI: 10.1109/tpami.2019.2952353

Assessing Transferability From Simulation to Reality for Reinforcement Learning

Abstract: Learning robot control policies from physics simulations is of great interest to the robotics community as it may render the learning process faster, cheaper, and safer by alleviating the need for expensive real-world experiments. However, the direct transfer of learned behavior from simulation to reality is a major challenge. Optimizing a policy on a slightly faulty simulator can easily lead to the maximization of the 'Simulation Optimization Bias' (SOB). In this case, the optimizer exploits modeling errors o…
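The abstract's central quantity, the Simulation Optimization Bias, can be illustrated with a short numerical sketch. The snippet below is a hypothetical toy model, not the paper's method: it assumes a scalar return J(θ, ξ) = -(θ - ξ)² over a domain parameter ξ ~ N(0, 1), and shows that a policy optimized against a finite sample of simulated domains looks better on those samples than it truly is in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_return(theta, xi):
    # Hypothetical scalar return: the best policy parameter theta depends
    # on the unknown domain parameter xi (e.g. a mass or friction value).
    return -(theta - xi) ** 2

n_domains = 5      # simulated domains per "training run" (small n -> large bias)
n_trials = 10_000  # repetitions to estimate the expected bias

gaps = []
for _ in range(n_trials):
    xi = rng.normal(0.0, 1.0, size=n_domains)   # sampled domain parameters
    # The optimizer maximizes the return *estimated on the sampled domains*;
    # for this quadratic toy objective the maximizer is the sample mean.
    theta_star = xi.mean()
    j_estimated = toy_return(theta_star, xi).mean()
    j_true = -(theta_star**2 + 1.0)              # E_xi[toy_return] for xi ~ N(0, 1)
    gaps.append(j_estimated - j_true)

print(f"estimated SOB: {np.mean(gaps):.3f} (exact value in this toy model: {2 / n_domains})")
```

In this toy model the expected gap works out to exactly 2/n (0.4 for five domains), so sampling more simulated domains shrinks the optimizer's self-assessment bias — one intuition for why the SOB matters when assessing transferability.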

Cited by 48 publications (37 citation statements) · References 26 publications

Citation statements (ordered by relevance):
“…Other works approached the problem by keeping a fixed distribution over the physical parameters and, instead, relying on intelligent sampling techniques to improve generalization: [23] guides training on increasingly harder environment variations, while [24] increases the number of sampled simulated environments until satisfactory transfer behavior is reached.…”
Section: Related Work
confidence: 99%
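For intuition, the stopping idea attributed to [24] above — adding sampled simulated environments until transfer looks satisfactory — can be sketched in a few lines. Everything here is hypothetical (the toy return, the doubling schedule, the tolerance); it only illustrates the growing-sample loop, not the cited algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_domains(n):
    # Fixed randomization distribution over a toy domain parameter
    # (e.g. a mass); the numbers are invented for the example.
    return rng.normal(1.0, 0.3, size=n)

def train_policy(domains):
    # For the toy return -(theta - xi)^2, the optimal "policy" over the
    # sampled domains is simply their mean.
    return domains.mean()

def returns(theta, domains):
    return -((theta - domains) ** 2)

n, tol = 2, 0.01
while True:
    train_dom = sample_domains(n)
    theta = train_policy(train_dom)
    # Optimism gap: estimated return on the training domains minus the
    # return on freshly sampled (held-out) domains.
    gap = returns(theta, train_dom).mean() - returns(theta, sample_domains(10_000)).mean()
    if gap < tol:          # transfer behavior deemed satisfactory
        break
    n *= 2                 # sample more simulated environments and retrain

print(f"stopped at n = {n} sampled domains (gap = {gap:.4f})")
```

The doubling schedule and the tolerance are arbitrary choices for illustration; only the pattern — grow the set of simulated environments until the estimated transfer gap is acceptable — reflects the cited idea.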
“…They found empirically that sampling domain parameters from a uniform distribution, together with applying random forces and regularizing the observation space, can be sufficient to cross the reality gap. In contrast to the previous methods, Muratore et al. [8] introduce an approach to estimate the transferability of a policy learned from randomized physics simulations. Moreover, the authors propose a meta-algorithm which provides a probabilistic guarantee on the performance loss when transferring the policy between two domains from the same distribution.…”
Section: A. Domain Randomization
confidence: 99%
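The guarantee mentioned in the statement above can be given a rough Monte-Carlo flavor. The following sketch is an assumption-laden illustration, not the meta-algorithm of Muratore et al. [8]: for a fixed toy policy it merely estimates an empirical 95%-quantile bound on the performance loss between two domains drawn from the same distribution (the toy return, the Gaussian domain distribution, and the 0.95 level are all invented for the example).

```python
import numpy as np

rng = np.random.default_rng(2)

def toy_return(theta, xi):
    # Hypothetical return of a fixed policy theta in a domain xi.
    return -(theta - xi) ** 2

theta = 1.0                                  # some fixed, already-trained policy
xi_a = rng.normal(1.0, 0.3, size=100_000)    # domains the policy was judged in
xi_b = rng.normal(1.0, 0.3, size=100_000)    # domains it is transferred to

# Empirical high-probability bound on the loss J(theta, xi_a) - J(theta, xi_b)
# when both domains come from the same randomization distribution.
loss = toy_return(theta, xi_a) - toy_return(theta, xi_b)
bound = np.quantile(loss, 0.95)
print(f"with ~95% probability, the transfer loss stays below {bound:.3f}")
```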
“…Learning from randomized simulations has been shown to be a promising approach for learning robot control policies that transfer to the real world. Examples cover manipulation [1], [2], [3], [4], [5], trajectory optimization [6], continuous control [7], [8], [9], vision [10], [11], [12], [13], and locomotion tasks [14], [15], [16]. Independent of the task, all domain randomization methods can be classified according to whether they use target domain data to update the distribution over simulators.…”
Section: Introduction
confidence: 99%
“…In particular, collecting the amount of example trajectories required by most state-of-the-art model-free DRL algorithms is infeasible for current robots [4]. A common solution is to resort to synthetic data based on rigid-body dynamics, addressing the mismatch introduced by the sim-to-real gap in a subsequent stage [5], [6]. Nonetheless, learned behaviors often display unnatural characteristics, such as asymmetric gaits, abrupt motions of the body and limbs, or even unrealistic motions exploiting imperfections and glitches in the physical simulator of choice.…”
Section: Introduction
confidence: 99%