Sehoon Ha scite author profile

Deep reinforcement learning (deep RL) holds the promise of automating the acquisition of complex controllers that can map sensory inputs directly to low-level actions. In the domain of robotic locomotion, deep RL could enable learning locomotion skills with minimal engineering and without an explicit model of the robot dynamics. Unfortunately, applying deep RL to real-world robotic tasks is exceptionally difficult, primarily due to poor sample complexity and sensitivity to hyperparameters. While hyperparameters can be easily tuned in simulated domains, tuning may be prohibitively expensive on physical systems, such as legged robots, that can be damaged through extensive trial-and-error learning. In this paper, we propose a sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies. We apply this method to learning walking gaits on a real-world Minitaur robot. Our method can acquire a stable gait from scratch directly in the real world in about two hours, without relying on any model or simulation, and the resulting policy is robust to moderate variations in the environment. We further show that our algorithm achieves state-of-the-art performance on simulated benchmarks with a single set of hyperparameters. Videos of training and the learned policy can be found on the project website.
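For readers unfamiliar with the term, the generic maximum entropy RL objective that the abstract builds on can be written as below. The abstract itself gives no formula, so the symbols are assumptions: \pi is the policy, \rho_\pi its state-action distribution, r the reward, \mathcal{H} the entropy, and \alpha a temperature weighting the entropy term. This is the standard textbook form and not necessarily the exact variant used in the paper.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
```

Setting \alpha = 0 recovers the conventional expected-return objective; larger \alpha rewards more stochastic policies.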

DART: Dynamic Animation and Robotics Toolkit

Lee, Grey, Ha et al. 2018. JOSS.

226

141

Summary: DART (Dynamic Animation and Robotics Toolkit) is a collaborative, cross-platform, open source library created by the Graphics Lab and Humanoid Robotics Lab at Georgia Institute of Technology with ongoing contributions from the Personal Robotics Lab at University of Washington and Open Source Robotics Foundation. The library provides data structures and algorithms for kinematic and dynamic applications in robotics and computer animation. DART is distinguished by its accuracy and stability due to its use of generalized coordinates to represent articulated rigid body systems in the geometric notations (Park, Bobrow, and Ploen 1995) and Featherstone's Articulated Body Algorithm (Featherstone 2008) using a Lie group formulation to compute forward dynamics (Ploen and Park 1999) and hybrid dynamics (Sohl and Bobrow 2001). For developers, in contrast to many popular physics engines which view the simulator as a black box, DART gives full access to internal kinematic and dynamic quantities, such as the mass matrix, Coriolis and centrifugal forces, transformation matrices and their derivatives. DART also provides an efficient computation of Jacobian matrices for arbitrary body points and coordinate frames. The frame semantics of DART allows users to define arbitrary reference frames (both inertial and non-inertial) and use those frames to specify or request data. For air-tight code safety, forward kinematics and dynamics values are updated automatically through lazy evaluation, making DART suitable for real-time controllers. In addition, DART provides flexibility to extend the API for embedding user-provided classes into DART data structures. Contacts and collisions are handled using an implicit time-stepping, velocity-based LCP (linear complementarity problem) to guarantee non-penetration, directional friction, and approximated Coulomb friction cone conditions (Stewart and Trinkle 1996).
DART has applications in robotics and computer animation because it features a multibody dynamic simulator and various kinematic tools for control and motion planning.
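The summary mentions a velocity-based LCP without stating it. As a reminder, the general linear complementarity problem has the form below; the symbols M, q, z, and w are the conventional ones and are assumptions here, since the summary does not define DART's specific matrices. In contact formulations of this kind, z typically collects contact impulses and w the corresponding relative velocities, but that reading is a general convention, not a statement about DART's internals.

```latex
\text{find } z \in \mathbb{R}^n \text{ such that } \quad
w = M z + q, \qquad z \ge 0, \qquad w \ge 0, \qquad z^{\top} w = 0
```

The complementarity condition z^{\top} w = 0 says that, for each component, either z_i or w_i is zero.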

Learning to Walk via Deep Reinforcement Learning

Haarnoja, Ha, Zhou et al. 2018. Preprint.

104

Joint Optimization of Robot Design and Motion Parameters using the Implicit Function Theorem

Coros, Alspach et al. 2017.

Abstract: We present a novel computational approach to optimizing the morphological design of robots. Our framework takes as input a parameterized robot design and a motion plan consisting of trajectories for end-effectors, as well as optionally, for its body. The algorithm we propose is used to optimize design parameters, namely link lengths and the placement of actuators, while concurrently adjusting motion parameters such as joint trajectories, actuator inputs, and contact forces. Our key insight is that the complex relationship between design and motion parameters can be established via sensitivity analysis if the robot's movements are modeled as spatio-temporal solutions to optimal control problems. This relationship between form and function allows us to automatically optimize robot designs based on specifications expressed as a function of range of motion or actuator forces. We evaluate our model by computationally optimizing two simulated robots that employ linear actuators: a manipulator and a large quadruped. We further validate our framework by optimizing the design of a small quadrupedal robot and testing its performance using a hardware implementation.
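The sensitivity analysis mentioned in the abstract rests on the implicit function theorem: if a motion parameter m is defined implicitly by a constraint G(p, m) = 0 for a design parameter p, then dm/dp = -(dG/dm)^(-1) dG/dp. The following is a minimal scalar sketch of that idea only. The constraint G(p, m) = m^3 + m - p and the function names solve_motion and sensitivity are hypothetical, chosen for illustration, and are not the paper's formulation.

```python
def solve_motion(p, m0=0.0, iters=50):
    """Solve the toy constraint G(p, m) = m**3 + m - p = 0 for m by Newton's method."""
    m = m0
    for _ in range(iters):
        g = m**3 + m - p          # constraint residual
        dg_dm = 3 * m**2 + 1      # partial derivative of G with respect to m
        m -= g / dg_dm
    return m


def sensitivity(p):
    """Return dm/dp from the implicit function theorem: -(dG/dm)^-1 * dG/dp."""
    m = solve_motion(p)
    dg_dm = 3 * m**2 + 1
    dg_dp = -1.0                  # partial derivative of G with respect to p
    return -dg_dp / dg_dm


p = 2.0
analytic = sensitivity(p)

# Cross-check against a central finite difference of the solved motion.
h = 1e-6
numeric = (solve_motion(p + h) - solve_motion(p - h)) / (2 * h)

print(analytic, numeric)
```

The point of the technique is that the derivative comes from the constraint at the solution, so the design gradient does not require re-solving the motion problem for every perturbed design; the finite difference here serves only as a check.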

Learning Fast Adaptation With Meta Strategy Optimization

Tan, Bai et al. 2020. IEEE Robot. Autom. Lett.

The ability to walk in new scenarios is a key milestone on the path toward real-world applications of legged robots. In this work, we introduce Meta Strategy Optimization, a meta-learning algorithm for training policies with latent variable inputs that can quickly adapt to new scenarios with a handful of trials in the target environment. The key idea behind MSO is to expose the same adaptation process, Strategy Optimization (SO), to both the training and testing phases. This allows MSO to effectively learn locomotion skills as well as a latent space that is suitable for fast adaptation. We evaluate our method on a real quadruped robot and demonstrate successful adaptation in various scenarios, including sim-to-real transfer, walking with a weakened motor, or climbing up a slope. Furthermore, we quantitatively analyze the generalization capability of the trained policy in simulated environments. Both real and simulated experiments show that our method outperforms previous methods in adaptation to novel tasks.
