The ability of robots to model their own dynamics is key to autonomous planning and learning, as well as to autonomous damage detection and recovery. Traditionally, dynamic models are pre-programmed or learned from external observations and IMU data. Here, we demonstrate for the first time how a task-agnostic dynamic self-model can be learned using only a single first-person-view camera in a self-supervised manner, without any prior knowledge of robot morphology, kinematics, or task. We trained an egocentric visual self-model using random motor babbling on a 12-DoF robot. We then show how the robot can leverage its visual self-model to achieve various locomotion tasks, such as moving forward, moving backward, and turning, all without any additional physical training. The accuracy of the egocentric model exceeds that of a model trained using an IMU. We also show how a robot can automatically detect and recover from damage. We suggest that self-supervised egocentric visual self-modeling could allow complex systems to continuously model themselves without additional sensors or prior knowledge.
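To make the self-supervised setup concrete, below is a minimal sketch of how such an egocentric visual self-model could be trained on motor-babbling data. This is not the authors' released code: the architecture (a small CNN encoder fused with the motor command by an MLP head), the predicted quantity (planar body displacement), and the assumption that babbling data arrives as (image, motor command, measured displacement) tuples are all illustrative assumptions.

```python
# Hedged sketch of self-supervised egocentric self-model training.
# Assumptions (not from the paper): data is (image, command, displacement)
# tuples collected by random motor babbling; the network predicts the
# resulting planar body displacement (dx, dy, dyaw).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset


class EgocentricSelfModel(nn.Module):
    """Predicts body displacement from the current first-person camera
    frame and a candidate motor command."""

    def __init__(self, num_motors: int = 12):
        super().__init__()
        self.encoder = nn.Sequential(          # CNN over the egocentric frame
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.head = nn.Sequential(             # fuse vision with the command
            nn.Linear(32 * 16 + num_motors, 128), nn.ReLU(),
            nn.Linear(128, 3),                 # predicted (dx, dy, dyaw)
        )

    def forward(self, image, command):
        return self.head(torch.cat([self.encoder(image), command], dim=-1))


def train_self_model(dataset: Dataset, epochs: int = 50) -> EgocentricSelfModel:
    """Self-supervised regression: the robot's own measured displacement
    serves as the training target, so no external labels are needed."""
    model = EgocentricSelfModel()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    for _ in range(epochs):
        for image, command, displacement in loader:
            loss = nn.functional.mse_loss(model(image, command), displacement)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

Because the supervision signal is the robot's own observed motion, the same training loop applies unchanged after damage: re-babbling and refitting the model is what enables detection and recovery.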
Self-modeling refers to an agent's ability to learn a predictive model of its own behavior. A continuously adapted self-model can serve as an internal simulator, enabling the agent to plan and assess various potential behaviors internally, reducing the need for expensive physical experimentation. Self-models are especially important in legged locomotion, where manual modeling is difficult, reinforcement learning is slow, and physical experimentation is risky. Here, we propose a Quasi-static Self-Modeling framework that learns a predictive model of only the high-level quasi-static dynamics, rather than a continuous dynamics model. Experimental results on a 12-degree-of-freedom legged robot demonstrate improvements over model-free and traditional model-based continuous approaches. Using 80 diverse robot morphologies, we confirm a correlation of R² = 0.94 between the improvement rendered by our method and the DoF of the robot, suggesting that as future robots increase in complexity, this approach will become more valuable.
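The planning side of such a framework can be illustrated with a simple random-shooting step. The sketch below is a hedged illustration, not the paper's exact method: it assumes a learned quasi-static self-model with a (state, action) interface that returns the settled body displacement, and the candidate-sampling planner, function names, and goal scoring are all invented for this example.

```python
# Hedged sketch of planning with a learned quasi-static self-model.
# `self_model(state, action)` is assumed to return the predicted change
# in body pose (dx, dy, dyaw) after the robot settles into the commanded
# quasi-static joint configuration.
import torch


def plan_step(self_model, state, num_candidates=256, num_motors=12,
              goal_direction=(1.0, 0.0)):
    """Random shooting: sample candidate motor commands, score each by
    predicted progress along `goal_direction`, return the best one."""
    actions = torch.rand(num_candidates, num_motors) * 2 - 1   # in [-1, 1]
    states = state.expand(num_candidates, -1)                  # repeat state
    with torch.no_grad():
        pred = self_model(states, actions)                     # (N, 3)
    goal = torch.tensor(goal_direction)
    scores = pred[:, :2] @ goal                                # forward progress
    return actions[scores.argmax()]
```

Restricting the model to quasi-static transitions is what keeps this planner cheap: each candidate is scored with a single forward pass rather than a rollout of continuous dynamics, which is plausibly why the benefit grows with the robot's degrees of freedom.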