2006
DOI: 10.1177/0278364906063822
Fast Biped Walking with a Sensor-driven Neuronal Controller and Real-time Online Learning

Abstract: In this paper, we present our design and experiments on a planar biped robot under the control of a pure sensor-driven controller. This design has some special mechanical features, for example small curved feet allowing a rolling action and a properly positioned center of mass, that facilitate fast walking through exploitation of the robot's natural dynamics. Our sensor-driven controller is built with biologically inspired sensor- and motor-neuron models, and does not employ any kind of position or trajectory tra…

Cited by 130 publications (128 citation statements). References 27 publications.
“…Hence, they are not easily applicable to real robots. To date, various automatic gait optimization methods have been used in locomotion to design gaits, including gradient descent methods [18,3], evolutionary algorithms [1,19], particle swarm optimization [20] and many others [21,22,2,17]. We now discuss the approaches that use surrogate models to optimize robot locomotion.…”
Section: Optimization Methods in Robotics (citation type: mentioning)
Confidence: 99%
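The surrogate-model approach this excerpt turns to can be summarized in a few lines: fit a cheap statistical model of the expensive gait-evaluation function and let an acquisition rule pick the next robot trial. Below is a minimal sketch assuming a scalar walking-speed objective over normalized gait parameters; `evaluate_gait`, the kernel length scale, and the UCB weight are illustrative placeholders, not values from any of the cited papers.

```python
# Minimal surrogate-model gait optimization sketch (illustrative only).
# A Gaussian-process surrogate of walking speed over gait parameters is
# refit after each (expensive) trial; the next trial maximizes a simple
# upper-confidence-bound acquisition over random candidate gaits.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def evaluate_gait(theta):
    """Stand-in for a real robot trial returning forward speed (m/s)."""
    return -np.sum((theta - 0.3) ** 2) + 0.01 * np.random.randn()

rng = np.random.default_rng(0)
dim, n_init, n_trials = 4, 5, 20
X = rng.uniform(0.0, 1.0, size=(n_init, dim))      # initial gait parameters
y = np.array([evaluate_gait(x) for x in X])        # measured speeds

for _ in range(n_trials):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2)).fit(X, y)
    cand = rng.uniform(0.0, 1.0, size=(500, dim))  # random candidate gaits
    mu, sigma = gp.predict(cand, return_std=True)
    best = cand[np.argmax(mu + 2.0 * sigma)]       # UCB acquisition
    X = np.vstack([X, best])
    y = np.append(y, evaluate_gait(best))

print("best gait parameters:", X[np.argmax(y)], "speed:", y.max())
```

The point of the surrogate is data efficiency: each real trial costs wear and time on the hardware, so the model absorbs as much of the search as possible.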
“…Due to the simplicity of this approach, such methods have been successfully applied to robotics in numerous applications [6], [9], [11], [14]. However, the straightforward application to robotics is not without peril: generating the Δθ_j requires proper knowledge of the system, since badly chosen Δθ_j can destabilize the policy, making the system unstable, and the gradient estimation process is then prone to fail.…”
Section: General Approaches to Policy Gradient Estimation (citation type: mentioning)
Confidence: 99%
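The perturbation scheme this excerpt describes is straightforward to sketch: roll out the policy at θ and at θ + Δθ_j, then regress the return differences onto the perturbations to estimate the gradient of expected return. Everything below (`rollout_return`, the perturbation scale `eps`, the step size) is a hypothetical stand-in for the real system, not the cited authors' code.

```python
# Finite-difference policy gradient estimation (sketch).
# Returns from small parameter perturbations are regressed onto the
# perturbations to estimate the gradient g of expected return J(theta):
#   g_FD = (dTheta^T dTheta)^-1 dTheta^T dJ
import numpy as np

def rollout_return(theta):
    """Placeholder for running the policy on the system and measuring return."""
    return -np.sum((theta - 1.0) ** 2)

def fd_policy_gradient(theta, n_perturb=20, eps=0.05, rng=None):
    rng = rng or np.random.default_rng(0)
    j_ref = rollout_return(theta)
    d_theta = eps * rng.standard_normal((n_perturb, theta.size))  # the delta-theta_j
    d_j = np.array([rollout_return(theta + d) - j_ref for d in d_theta])
    # Least-squares fit of the gradient to (perturbation, return-change) pairs.
    grad, *_ = np.linalg.lstsq(d_theta, d_j, rcond=None)
    return grad

theta = np.zeros(3)
for _ in range(100):
    theta += 0.1 * fd_policy_gradient(theta)  # plain gradient ascent on J
print(theta)  # approaches [1, 1, 1] for this toy objective
```

The excerpt's warning shows up directly in `eps`: too large and the perturbed policies can destabilize the plant; too small and the return differences drown in measurement noise.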
“…Policy gradient methods are a notable exception to this statement. Starting with the pioneering work of Gullapalli, Franklin and Benbrahim [1], [2] in the early 1990s, these methods have been applied to a variety of robot learning problems, ranging from simple control tasks (e.g., balancing a ball on a beam [3], and pole balancing [4]) to complex learning tasks involving many degrees of freedom, such as learning of complex motor skills [2], [5], [6] and locomotion [7]-[14]. The advantages of policy gradient methods for robotics are numerous.…”
Section: Introduction (citation type: mentioning)
Confidence: 99%
“…The computation of the policy update is the key step here, and a variety of updates have been proposed, ranging from pairwise comparisons [Strens and Moore, 2001; Ng et al., 2004a] over gradient estimation using finite policy differences [Geng et al., 2006; Mitsunaga et al., 2005; Sato et al., 2002; Tedrake et al., 2005] and general stochastic optimization methods (such as Nelder-Mead [Bagnell and Schneider, 2001], cross entropy [Rubinstein and Kroese, 2004] and population-based methods [Goldberg, 1989]) to approaches coming from optimal control, such as differential dynamic programming (DDP) [Atkeson, 1998] and multiple shooting approaches [Betts, 2001], as well as core reinforcement learning methods.…”
Section: Policy Search (citation type: mentioning)
Confidence: 99%
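Of the stochastic optimizers this excerpt lists, the cross-entropy method is perhaps the easiest to sketch: sample parameter vectors from a Gaussian, evaluate each by a rollout, and refit the Gaussian to the elite fraction. Population size, elite fraction, and `rollout_return` below are illustrative assumptions, not settings from the cited work.

```python
# Cross-entropy method for policy search (sketch).
# Sample parameter vectors from a Gaussian, score each with a rollout,
# then refit mean/std to the elite samples and repeat.
import numpy as np

def rollout_return(theta):
    """Placeholder for a policy rollout on the system."""
    return -np.sum((theta - 2.0) ** 2)

rng = np.random.default_rng(1)
mu, sigma = np.zeros(5), np.ones(5)
for _ in range(50):
    samples = mu + sigma * rng.standard_normal((100, 5))
    returns = np.array([rollout_return(s) for s in samples])
    elite = samples[np.argsort(returns)[-10:]]          # top 10% of samples
    mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-3  # avoid collapse
print(mu)  # approaches the optimum [2, 2, 2, 2, 2] of the toy objective
```

Unlike the finite-difference update above, this is derivative-free, which is one reason such population-based updates sit alongside gradient estimators in the survey's taxonomy.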
“…For rhythmic behaviors, half-elliptical loci have been used as a representation of the gait pattern of a robot dog [Kohl and Stone, 2004]. Neural Networks: instead of analytically describing rhythmic movements, neural networks can be used as oscillators to learn gaits of a two-legged robot [Geng et al., 2006; Endo et al., 2008]. A peg-in-hole task (see Figure 2.1b), a ball-balancing task, and a navigation task [Hailu and Sommer, 1998] have also been learned with neural networks as policy function approximators.…”
Section: Pre-structured Policies (citation type: mentioning)
Confidence: 99%
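The half-elliptical locus mentioned in this excerpt is a usefully concrete example of a pre-structured policy: the foot traces a flat chord during stance and an elliptical arc during swing, so an entire gait reduces to a handful of learnable numbers. The parameterization below (center, semi-axes, fifty-fifty stance/swing split) is a plausible reconstruction for illustration, not the exact one from Kohl and Stone.

```python
# Half-elliptical foot locus as a gait representation (sketch).
# The foot traces a half-ellipse in the sagittal plane: the flat chord is
# the ground-contact (stance) phase, the elliptical arc the swing phase.
import numpy as np

def foot_position(phase, x0=0.0, z0=-0.18, a=0.04, b=0.02):
    """Foot (x, z) at gait phase in [0, 1); first half stance, second half swing."""
    if phase < 0.5:                      # stance: slide straight back along the chord
        x = x0 + a - 4.0 * a * phase
        z = z0
    else:                                # swing: arc forward over the half-ellipse
        t = np.pi * (2.0 * phase - 1.0)  # t in [0, pi)
        x = x0 - a * np.cos(t)
        z = z0 + b * np.sin(t)
    return x, z

# A gait is then just (x0, z0, a, b) plus timing: exactly the kind of
# low-dimensional, pre-structured policy that gait search can exploit.
for p in np.linspace(0.0, 1.0, 8, endpoint=False):
    print(f"phase={p:.2f} -> foot at {foot_position(p)}")
```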