2018 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2018.8460528
Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

Abstract: Simulations are attractive environments for training agents as they provide an abundant source of data and alleviate certain safety concerns during the training process. But the behaviours developed by agents in simulation are often specific to the characteristics of the simulator. Due to modeling error, strategies that are successful in simulation may not transfer to their real world counterparts. In this paper, we demonstrate a simple method to bridge this "reality gap". By randomizing the dynamics of the si…


Cited by 860 publications (609 citation statements)
References 21 publications
“…In [5], the authors use a recurrent neural network to explicitly learn model parameters through real-time interaction with an environment; these parameters are then used to augment the observation for a standard reinforcement learning algorithm. In [6], the authors use a recurrent policy and value function in a modified deep deterministic policy gradient algorithm to learn a policy for a robotic manipulator arm that uses real camera images as observations. In both cases, the agents train over a wide range of randomized system parameters.…”
Section: Introduction
confidence: 99%
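The observation-augmentation idea quoted above, where inferred dynamics parameters are concatenated onto the state before it is fed to the policy, can be sketched as follows. This is a minimal illustration, not the cited papers' implementation; the dimensions and parameter names (link mass, joint friction) are hypothetical.

```python
import numpy as np

def augment_observation(obs, dynamics_params):
    """Concatenate (estimated) dynamics parameters onto the raw observation,
    so the policy conditions on both the state and the inferred model.
    `dynamics_params` would come from a system-identification module in the
    cited approaches; here it is just a fixed illustrative vector."""
    return np.concatenate([obs, dynamics_params])

rng = np.random.default_rng(0)
obs = rng.standard_normal(8)           # hypothetical 8-dim robot state
params = np.array([1.2, 0.05])         # e.g. estimated link mass, joint friction
aug = augment_observation(obs, params) # 10-dim policy input
```

A policy network trained on such augmented inputs can, in principle, adapt its behaviour to the identified dynamics rather than averaging over them.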
“…Others perform full manipulation tasks based on multiple input modalities [1,20,31] but require a pre-specified manipulation graph [31], demonstrate only on one task [20,31], or require human demonstration and object CAD models [1]. There have been promising works that train manipulation policies in simulation and transfer them to a real robot [3,10,50]. However, only a few works have focused on contact-rich tasks [24] and none relied on haptic feedback in simulation, most likely because of the lack of fidelity of contact simulation and collision modeling for articulated rigid-body systems [21,25].…”
Section: A Contact-rich Manipulation
confidence: 99%
“…However, when transferred to real-world robotic systems, most of these methods become less attractive due to high sample complexity and a lack of explainability of state-of-the-art deep RL algorithms. As a consequence, the research field of domain randomization has recently been gaining interest [10,11,12,13,14,15,16,17]. This class of approaches promises to transfer control policies learned in simulation (source domain) to the real world (target domain) by randomizing the simulator's parameters (e.g., masses, extents, or friction coefficients) and hence train from a set of models instead of just one nominal model.…”
Section: Introduction
confidence: 99%
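The core loop of domain randomization described in the statement above, sampling a fresh set of simulator parameters (masses, friction coefficients, etc.) for each training episode so the policy trains against a distribution of models rather than one nominal model, can be sketched as below. The parameter names and ranges are purely illustrative assumptions, not values from the paper.

```python
import random

# Hypothetical parameter ranges; names and bounds are illustrative only.
PARAM_RANGES = {
    "link_mass_scale":   (0.5, 1.5),   # multiplier on nominal link masses
    "friction_coeff":    (0.5, 1.2),
    "motor_gain":        (0.8, 1.2),
    "observation_noise": (0.0, 0.05),
}

def sample_dynamics(rng):
    """Draw one set of simulator parameters uniformly from the ranges,
    so each episode runs under a different dynamics model."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

def training_dynamics(num_episodes, seed=0):
    """Yield the randomized dynamics used for each training episode.
    In a real pipeline, each drawn dict would configure the simulator
    before the rollout, followed by a policy update."""
    rng = random.Random(seed)
    for _ in range(num_episodes):
        yield sample_dynamics(rng)

episodes = list(training_dynamics(3))
```

Because the policy never sees the exact real-world parameters during training, it is pushed toward behaviours that are robust across the sampled model family, which is the mechanism these approaches rely on for sim-to-real transfer.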