Combining self-supervised learning and imitation for vision-based rope manipulation

Nair, Ashvin; Chen, Dian; Agrawal, Pulkit; Isola, Phillip; Abbeel, Pieter; Malik, Jitendra; Levine, Sergey

doi:10.1109/icra.2017.7989247

Cited by 246 publications

(251 citation statements)

References 31 publications

Supporting

Mentioning

251

Contrasting


“…1) Static Obstacles: We begin our investigation by comparing our planning based method to a baseline of only using an inverse model without planning, as in the previous block and wall domain. We designed a rope manipulation environment similar to [35], but which also contains fixed obstacles which the rope cannot move through.…”

Section: Real Robot Rope Manipulation Domain (mentioning)

confidence: 99%

“…For data collection, we followed the approach in [35] for generating random pokes of the rope, and collected 2k samples for observations and actions. To increase the size of our dataset, we collected 10k additional observation samples by manually manipulating the rope (which is much faster to collect).…”

Section: Real Robot Rope Manipulation Domain (mentioning)

confidence: 99%

“…In particular, Nair et al. [35] learned an inverse dynamics model for rope manipulation directly from raw image data collected by randomly poking the rope. This controller was used to manipulate the rope into a given shape, whereby a human would first provide a sequence of images (a visual plan) that prescribes the desired trajectory of the rope, and then the learned inverse model would compute actions that track the plan (a.k.a.…”

Section: Introduction (mentioning)

confidence: 99%


Learning Robotic Manipulation through Visual Planning and Acting

Wang, Kurutach, Liu et al. 2019

Robotics: Science and Systems XV

Self Cite


Planning for robotic manipulation requires reasoning about the changes a robot can effect on objects. When such interactions can be modelled analytically, as in domains with rigid objects, efficient planning algorithms exist. However, in both domestic and industrial domains, the objects of interest can be soft, or deformable, and hard to model analytically. For such cases, we posit that a data-driven modelling approach is more suitable. In recent years, progress in deep generative models has produced methods that learn to 'imagine' plausible images from data. Building on the recent Causal InfoGAN generative model, in this work we learn to imagine goal-directed object manipulation directly from raw image data of self-supervised interaction of the robot with the object. After learning, given a goal observation of the system, our model can generate an imagined plan: a sequence of images that transition the object into the desired goal. To execute the plan, we use it as a reference trajectory to track with a visual servoing controller, which we also learn from the data as an inverse dynamics model. In a simulated manipulation task, we show that separating the problem into visual planning and visual tracking control is more sample efficient and more interpretable than alternative data-driven approaches. We further demonstrate our approach on learning to imagine and execute in 3 environments, the last of which is deformable rope manipulation on a PR2 robot.

“…Therefore, this model can be used to infer the missing action labels of the expert. Then, the inferred actions can be executed to reproduce the trainer's states [Nair et al., 2017]. As an alternative, after inferring the actions, a mapping from states to the actions can be learned and used to improve the learned model and consequently the policy [Torabi et al., 2018a].…”

Section: Imitation From Observation (mentioning)

confidence: 99%

Leveraging Human Guidance for Deep Reinforcement Learning Tasks

Zhang, Torabi, Guan et al. 2019

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence


Reinforcement learning agents can learn to solve sequential decision tasks by interacting with the environment. Human knowledge of how to solve these tasks can be incorporated using imitation learning, where the agent learns to imitate human-demonstrated decisions. However, human guidance is not limited to demonstrations. Other types of guidance could be more suitable for certain tasks and require less human effort. This survey provides a high-level overview of five recent learning frameworks that primarily rely on human guidance other than conventional, step-by-step action demonstrations. We review the motivation, assumptions, and implementation of each framework. We then discuss possible future research directions.


“…This FEM trained on the semantics of weather 0 is used as a teacher to train the student, which is capable of producing the semantics of all the other 14 weather conditions. The authors of [9] used the method of [27] and provided 10 separate networks for translating from weather 0 to weathers 2, 3, 4, 6, 8, 9, 10, 11, 12, and 13, respectively. The translated images for each of the 10 weather conditions along with weather 0 are fed in equal proportion to train the student.…”

Section: Models (mentioning)

confidence: 99%

Towards Generalizing Sensorimotor Control Across Weather Conditions

Khan, Wenzel, Cremers et al. 2019

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)


The ability of deep learning models to generalize well across different scenarios depends primarily on the quality and quantity of annotated data. Labeling large amounts of data for all possible scenarios that a model may encounter would not be feasible, if even possible. We propose a framework to deal with limited labeled training data and demonstrate it on the application of vision-based vehicle control. We show how limited steering angle data available for only one condition can be transferred to multiple different weather scenarios. This is done by leveraging unlabeled images in a teacher-student learning paradigm complemented with an image-to-image translation network. The translation network transfers the images to a new domain, whereas the teacher provides soft supervised targets to train the student on this domain. Furthermore, we demonstrate how the use of auxiliary networks can reduce the size of a model at inference time without affecting the accuracy. The experiments show that our approach generalizes well across multiple different weather conditions using only ground truth labels from one domain.
