Controllers in robotics often consist of expertdesigned heuristics, which can be hard to tune in higher dimensions. It is typical to use simulation to learn these parameters, but controllers learned in simulation often don't transfer to hardware. This necessitates optimization directly on hardware. However, collecting data on hardware can be expensive. This has led to a recent interest in adapting dataefficient learning techniques to robotics. One popular method is Bayesian Optimization (BO), a sample-efficient black-box optimization scheme, but its performance typically degrades in higher dimensions. We aim to overcome this problem by incorporating domain knowledge to reduce dimensionality in a meaningful way, with a focus on bipedal locomotion. In previous work, we proposed a transformation based on knowledge of human walking that projected a 16-dimensional controller to a 1-dimensional space. In simulation, this showed enhanced sample efficiency when optimizing human-inspired neuromuscular walking controllers on a humanoid model. In this paper, we present a generalized feature transform applicable to nonhumanoid robot morphologies and evaluate it on the ATRIAS bipedal robot -in simulation and on hardware. We present three different walking controllers; two are evaluated on the real robot. Our results show that this feature transform captures important aspects of walking and accelerates learning on hardware and simulation, as compared to traditional BO.
Abstract-Learning policies for bipedal locomotion can be difficult, as experiments are expensive and simulation does not usually transfer well to hardware. To counter this, we need algorithms that are sample efficient and inherently safe. Bayesian Optimization is a powerful sample-efficient tool for optimizing non-convex black-box functions. However, its performance can degrade in higher dimensions. We develop a distance metric for bipedal locomotion that enhances the sample-efficiency of Bayesian Optimization and use it to train a 16 dimensional neuromuscular model for planar walking. This distance metric reflects some basic gait features of healthy walking and helps us quickly eliminate a majority of unstable controllers. With our approach we can learn policies for walking in less than 100 trials for a range of challenging settings. In simulation, we show results on two different costs and on various terrains including rough ground and ramps, sloping upwards and downwards. We also perturb our models with unknown inertial disturbances analogous with differences between simulation and hardware. These results are promising, as they indicate that this method can potentially be used to learn control policies on hardware.
Cloth manipulation is a challenging task that, despite its importance, has received relatively little attention compared to rigid object manipulation. In this paper, we provide three benchmarks for evaluation and comparison of different approaches towards three basic tasks in cloth manipulation: spreading a tablecloth over a table, folding a towel, and dressing. The tasks can be executed on any bimanual robotic platform and the objects involved in the tasks are standardized and easy to acquire. We provide several complexity levels for each task, and describe the quality measures to evaluate task execution. Furthermore, we provide baseline solutions for all the tasks and evaluate them according to the proposed metrics.
In this work we propose an approach to learn a robust policy for solving the pivoting task. Recently, several model-free continuous control algorithms were shown to learn successful policies without prior knowledge of the dynamics of the task. However, obtaining successful policies required thousands to millions of training episodes, limiting the applicability of these approaches to real hardware. We developed a training procedure that allows us to use a simple custom simulator to learn policies robust to the mismatch of simulation vs robot. In our experiments, we demonstrate that the policy learned in the simulator is able to pivot the object to the desired target angle on the real robot. We also show generalization to an object with different inertia, shape, mass and friction properties than those used during training. This result is a step towards making model-free reinforcement learning available for solving robotics tasks via pre-training in simulators that offer only an imprecise match to the real-world dynamics.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.