Most previous work on artificial curiosity (AC) and intrinsic motivation focuses on basic concepts and theory. Experimental results are generally limited to toy scenarios, such as navigation in a simulated maze or control of a simple mechanical system with one or two degrees of freedom. To study AC in a more realistic setting, we embody a curious agent in the complex iCub humanoid robot. Our novel reinforcement learning (RL) framework consists of two layers: a state-of-the-art, low-level, reactive control layer, which controls the iCub while respecting constraints, and a high-level curious agent, which explores the iCub's state-action space through information gain maximization, learning a world model from experience while controlling the actual iCub hardware in real time. To the best of our knowledge, this is the first ever embodied, curious agent for real-time motion planning on a humanoid. We demonstrate that it can learn compact Markov models to represent large regions of the iCub's configuration space, and that the iCub explores intelligently, showing interest in its physical constraints as well as in objects it finds in its environment.
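The idea of exploring through information gain maximization while learning a Markov model can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the tabular model with a uniform pseudo-count prior, the one-step greedy action choice, the toy ring world, and the names `CuriousMarkovModel`, `ring_step`, and `explore` are all hypothetical. Information gain is measured as the KL divergence between the model's predicted next-state distribution after and before an update.

```python
import math


class CuriousMarkovModel:
    """Tabular transition model with pseudo-counts (an assumed representation)."""

    def __init__(self, n_states, n_actions, prior=1.0):
        self.n_states = n_states
        self.n_actions = n_actions
        # counts[s][a][s2] starts at `prior`, so every transition has nonzero probability.
        self.counts = [[[prior] * n_states for _ in range(n_actions)]
                       for _ in range(n_states)]

    def predict(self, s, a):
        """Return the predicted next-state distribution for (s, a)."""
        row = self.counts[s][a]
        total = sum(row)
        return [c / total for c in row]

    def update(self, s, a, s2):
        """Record one observed transition; return the information gain in nats."""
        before = self.predict(s, a)
        self.counts[s][a][s2] += 1.0
        after = self.predict(s, a)
        # KL(after || before): how much the prediction changed.
        return sum(p * math.log(p / q) for p, q in zip(after, before))

    def expected_gain(self, s, a):
        """Expected information gain of trying (s, a), under the current model."""
        probs = self.predict(s, a)
        gain = 0.0
        for s2, p in enumerate(probs):
            self.counts[s][a][s2] += 1.0   # hypothetically observe s2
            after = self.predict(s, a)
            self.counts[s][a][s2] -= 1.0   # undo the hypothetical observation
            gain += p * sum(x * math.log(x / y) for x, y in zip(after, probs))
        return gain

    def choose_action(self, s):
        """Greedy one-step curiosity: pick the action expected to be most informative."""
        return max(range(self.n_actions), key=lambda a: self.expected_gain(s, a))


def ring_step(s, a, n_states):
    """Toy deterministic world: action 1 moves right on a ring, anything else moves left."""
    return (s + (1 if a == 1 else -1)) % n_states


def explore(n_states=5, n_actions=2, steps=200):
    """Run the curious agent in the ring world and log its information gains."""
    model = CuriousMarkovModel(n_states, n_actions)
    s = 0
    gains = []
    for _ in range(steps):
        a = model.choose_action(s)
        s2 = ring_step(s, a, n_states)
        gains.append(model.update(s, a, s2))
        s = s2
    return model, gains
```

In this toy setting the agent prefers state-action pairs it has tried less often, because their expected information gain is larger, and the gain per step shrinks as the model becomes confident. A real humanoid would need a much richer state abstraction and a planner that looks further than one step ahead.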
Humanoids have to deal with novel, unsupervised, high-dimensional visual input streams. Our new method, AutoIncSFA, learns to compactly represent such complex sensory input sequences with very few meaningful features corresponding to high-level spatio-temporal abstractions, such as "a person is approaching me" or "an object was toppled". We explain the advantages of AutoIncSFA over previous related methods, and show that the compact codes greatly facilitate the task of a reinforcement learner driving the humanoid to actively explore its world like a playing baby, maximizing intrinsic curiosity reward signals for reaching states corresponding to previously unpredicted AutoIncSFA features.
To plan complex motions of robots with many degrees of freedom, our novel, very flexible framework builds task-relevant roadmaps (TRMs) using a new sampling-based optimizer called Natural Gradient Inverse Kinematics (NGIK), which is based on natural evolution strategies (NES). To build TRMs, NGIK iteratively optimizes postures covering task-spaces expressed by arbitrary task-functions, subject to constraints expressed by arbitrary cost-functions, transparently dealing with both hard and soft constraints. TRMs are grown to maximally cover the task-space while minimizing costs. Unlike Jacobian methods, our algorithm does not rely on calculating gradients, which makes it much simpler to apply. We show how NGIK outperforms recent related sampling algorithms. A video demo (http://youtu.be/N6x2e1Zf_yg) successfully applies TRMs to an iCub humanoid robot with 41 DOF in its upper body, arms, hands, head, and eyes. To our knowledge, no similar methods exhibit such a degree of flexibility in defining movements.
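To make the idea of gradient-free, sampling-based inverse kinematics concrete, here is a minimal sketch that optimizes a single posture for a planar three-link arm. It uses Separable NES (SNES), one member of the NES family, which may differ from the variant NGIK actually uses. The arm geometry, the joint limit, the penalty weight of 10, the initial search distribution, and the names `forward_kinematics`, `cost`, `utilities`, and `snes_ik` are all hypothetical choices for illustration. The sketch does not build a roadmap; it only shows how a task function and a soft constraint can be folded into one cost that is minimized without a Jacobian.

```python
import math
import random

LINKS = (1.0, 1.0, 1.0)   # hypothetical link lengths
JOINT_LIMIT = math.pi     # hypothetical symmetric joint limit, in radians


def forward_kinematics(q):
    """End-effector position of a planar serial arm with joint angles q."""
    x = y = angle = 0.0
    for length, joint in zip(LINKS, q):
        angle += joint
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y


def cost(q, target):
    """Task error plus a soft joint-limit penalty (the weight 10.0 is an assumption)."""
    x, y = forward_kinematics(q)
    task_error = (x - target[0]) ** 2 + (y - target[1]) ** 2
    limit_penalty = sum(max(0.0, abs(j) - JOINT_LIMIT) ** 2 for j in q)
    return task_error + 10.0 * limit_penalty


def utilities(n):
    """Rank-based fitness shaping: best rank first, values sum to zero."""
    raw = [max(0.0, math.log(n / 2 + 1) - math.log(k)) for k in range(1, n + 1)]
    total = sum(raw)
    return [r / total - 1.0 / n for r in raw]


def snes_ik(target, generations=300, population=20, seed=0):
    """Minimize cost(q, target) with Separable NES; returns the final mean posture."""
    rng = random.Random(seed)
    dim = len(LINKS)
    mu = [0.0] * dim       # mean of the search distribution (start: arm stretched out)
    sigma = [0.5] * dim    # per-joint standard deviation
    eta_sigma = (3 + math.log(dim)) / (5 * math.sqrt(dim))
    u = utilities(population)
    for _ in range(generations):
        noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(population)]
        samples = [[m + s * e for m, s, e in zip(mu, sigma, eps)] for eps in noise]
        # Only the ranking of the samples is used, never a gradient of the cost.
        order = sorted(range(population), key=lambda k: cost(samples[k], target))
        grad_mu = [sum(u[r] * noise[k][i] for r, k in enumerate(order))
                   for i in range(dim)]
        grad_sigma = [sum(u[r] * (noise[k][i] ** 2 - 1.0) for r, k in enumerate(order))
                      for i in range(dim)]
        mu = [m + s * g for m, s, g in zip(mu, sigma, grad_mu)]
        sigma = [s * math.exp(0.5 * eta_sigma * g) for s, g in zip(sigma, grad_sigma)]
    return mu
```

Calling `snes_ik((1.5, 1.0))` returns joint angles whose end-effector position can be checked with `forward_kinematics`. Because only the ranking of sampled postures matters, the cost function can be swapped for any other black-box task or constraint measure, which is the kind of flexibility the abstract emphasizes.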