Learning from Demonstration (Programming by Demonstration)

Calinon, Sylvain

doi:10.1007/978-3-642-41610-1_27-1

Cited by 45 publications

(29 citation statements)

References 43 publications


“…Although our platform supports controlling both arms and the head, for simplicity we only subjected the right arm to control and froze all other joints except when resetting to initial states. During execution, the policy $\pi_\theta$ generates the control $u_t = \pi_\theta(o_t)$ given current observation $o_t$. Observations and controls are both collected at 10 Hz.…”

Section: A Neural Network Control Policies (mentioning)

confidence: 99%
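The execution scheme in the statement above (query the policy once per observation, at a fixed rate) can be sketched as a simple control loop. This is a minimal illustration, not the cited authors' code: `run_policy`, `read_observation`, `send_control`, and the toy proportional policy are hypothetical names and stand-ins introduced here; only the 10 Hz rate and the relation u_t = π_θ(o_t) come from the quoted text.

```python
import time
from typing import Callable, List, Sequence, Tuple

Observation = Sequence[float]
Control = Sequence[float]


def run_policy(
    policy: Callable[[Observation], Control],
    read_observation: Callable[[], Observation],
    send_control: Callable[[Control], None],
    rate_hz: float = 10.0,
    num_steps: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[Observation, Control]]:
    """Run u_t = policy(o_t) at a fixed rate and log the (o_t, u_t) pairs."""
    period = 1.0 / rate_hz
    log: List[Tuple[Observation, Control]] = []
    for _ in range(num_steps):
        start = time.monotonic()
        o_t = read_observation()
        u_t = policy(o_t)
        send_control(u_t)
        log.append((o_t, u_t))
        # Wait for whatever remains of this control period.
        remaining = period - (time.monotonic() - start)
        if remaining > 0:
            sleep(remaining)
    return log


# Toy stand-ins (hypothetical): a constant observation and a proportional policy.
sent: List[Control] = []
log = run_policy(
    policy=lambda o: [-0.5 * x for x in o],
    read_observation=lambda: [1.0, 2.0],
    send_control=sent.append,
    num_steps=3,
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
```

Passing the `sleep` function in as a parameter keeps the loop testable without real delays; on hardware the default `time.sleep` would pace the loop at roughly the requested rate.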


Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation

Zhang, McCarthy, Jow, et al. 2018

2018 IEEE International Conference on Robotics and Automation (ICRA)



Imitation learning is a powerful paradigm for robot skill acquisition. However, obtaining demonstrations suitable for learning a policy that maps from raw pixels to actions can be challenging. In this paper we describe how consumer-grade Virtual Reality headsets and hand tracking hardware can be used to naturally teleoperate robots to perform complex tasks. We also describe how imitation learning can learn deep neural network policies (mapping from pixels to actions) that can acquire the demonstrated skills. Our experiments showcase the effectiveness of our approach for learning visuomotor skills.


“…Imitation learning is a class of methods for acquiring skills by observing demonstrations (see, e.g., [1], [2], [3] for surveys). It has been applied successfully to a wide range of domains in robotics, for example to autonomous driving [4], [5], [6], autonomous helicopter flight [7], gesturing [8], and manipulation [9], [10].…”

Section: Introduction (mentioning)

confidence: 99%

Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation

Zhang, McCarthy, Jow, et al. 2018

2018 IEEE International Conference on Robotics and Automation (ICRA)

“…$D \leftarrow D \cup \text{GetDemonstrations}(d)$
4: $\theta^* \leftarrow \arg\min_\theta L(D, \theta)$ # relearn
5: $t \leftarrow 0$ # re-start episode in current context
6: $\Omega = \text{AverageUncertainty}()$ # adapt query threshold
7: $a_t, \sigma_t = \pi_{\theta^*}(x_t)$ # re-select action
8
Fig. 1: The controller has to perform well on all tasks it faces sequentially with limited requests for task-specific demonstrations.…”

Section: Algorithm 1 Select Action and Train If Necessary (mentioning)

confidence: 99%
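The quoted algorithm steps (add demonstrations, relearn, adapt the query threshold, re-select the action) can be sketched as follows. This is a rough illustration under stated assumptions, not the cited paper's implementation: the condition that triggers a query (uncertainty above the threshold) is assumed here because the quote starts mid-algorithm, the episode restart in step 5 is left out, and `NearestDemoPolicy` is a hypothetical toy stand-in for a Bayesian neural network that reports distance to the nearest demonstrated state as its uncertainty. The `probes` argument is likewise an assumption about where the average uncertainty is measured.

```python
from statistics import mean
from typing import Callable, List, Tuple

Demo = Tuple[float, float]  # (state x, demonstrated action a)


class NearestDemoPolicy:
    """Toy stand-in (assumption) for a policy with predictive uncertainty."""

    def __init__(self) -> None:
        self.data: List[Demo] = []

    def fit(self, data: List[Demo]) -> None:
        self.data = list(data)

    def predict(self, x: float) -> Tuple[float, float]:
        """Return (action, uncertainty) for state x."""
        if not self.data:
            return 0.0, float("inf")
        nearest_x, nearest_a = min(self.data, key=lambda d: abs(d[0] - x))
        return nearest_a, abs(nearest_x - x)

    def average_uncertainty(self, probes: List[float]) -> float:
        return mean([self.predict(x)[1] for x in probes])


def select_action(
    policy: NearestDemoPolicy,
    x: float,
    data: List[Demo],
    threshold: float,
    get_demonstrations: Callable[[float], List[Demo]],
    probes: List[float],
) -> Tuple[float, float, float, bool]:
    """Return (action, sigma, threshold, queried) for state x."""
    a, sigma = policy.predict(x)
    queried = False
    if sigma > threshold:  # assumed trigger: too uncertain to act
        data.extend(get_demonstrations(x))  # grow the demonstration set D
        policy.fit(data)  # relearn
        threshold = policy.average_uncertainty(probes)  # adapt query threshold
        a, sigma = policy.predict(x)  # re-select action
        queried = True
    return a, sigma, threshold, queried


# Hypothetical expert that demonstrates a = 2x at the queried state.
expert = lambda x: [(x, 2.0 * x)]
policy, data, probes = NearestDemoPolicy(), [], [0.0, 1.0, 2.0]
first = select_action(policy, 1.0, data, 0.5, expert, probes)
second = select_action(policy, 1.2, data, first[2], expert, probes)
third = select_action(policy, 3.0, data, second[2], expert, probes)
```

In this toy run the first and third states trigger a demonstration request, while the second is close enough to existing data that the policy acts without asking.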

Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks

Thakur, Hoof, Higuera, et al. 2019

2019 International Conference on Robotics and Automation (ICRA)


Diversity of environments is a key challenge that causes learned robotic controllers to fail due to the discrepancies between the training and evaluation conditions. Training from demonstrations in various conditions can mitigate, but not completely prevent, such failures. Learned controllers such as neural networks typically do not have a notion of uncertainty that would allow them to diagnose an offset between training and testing conditions and potentially intervene. In this work, we propose to use Bayesian Neural Networks, which have such a notion of uncertainty. We show that uncertainty can be leveraged to consistently detect situations in high-dimensional simulated and real robotic domains in which the performance of the learned controller would be sub-par. Also, we show that such an uncertainty-based solution allows making an informed decision about when to invoke a fallback strategy. One fallback strategy is to request more data. We empirically show that providing data only when requested results in increased data-efficiency.


“…Prior work in automation has explored learning from demonstrations for highly unstructured tasks such as grasping in clutter, scooping, and pipetting [16], [19]. Past work has also addressed the specific problem of learning from demonstrations under constraints [4], [5]. A popular method for dealing with unknown constraints is to identify essential components of multiple successful trajectories based on variances in the corresponding states and then to produce a learned policy that also exhibits those components [6].…”

Section: Related Work Learning From Demonstrations In Automation (mentioning)

confidence: 99%
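The variance-based idea mentioned in the statement above (treat the parts of successful trajectories that vary little across demonstrations as essential) can be illustrated with a small sketch. It assumes one-dimensional states and demonstrations that are already time-aligned and of equal length; the function names, the toy data, and the threshold are hypothetical and not taken from the cited work.

```python
from statistics import pvariance
from typing import List


def per_step_variance(demos: List[List[float]]) -> List[float]:
    """Variance of the state across demonstrations at each time step."""
    return [pvariance(step) for step in zip(*demos)]


def essential_steps(demos: List[List[float]], threshold: float) -> List[int]:
    """Time steps where demonstrations agree closely (low variance)."""
    variances = per_step_variance(demos)
    return [t for t, v in enumerate(variances) if v < threshold]


# Three toy demonstrations: they differ at the start and end,
# but all pass through the same states at steps 1 and 2.
demos = [
    [0.0, 0.5, 1.0, 1.6],
    [0.4, 0.5, 1.0, 1.0],
    [-0.4, 0.5, 1.0, 2.2],
]
constrained = essential_steps(demos, threshold=0.01)
```

A learned policy could then be required to reproduce the states at the low-variance steps while being free to vary elsewhere.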

Constraint Estimation and Derivative-Free Recovery for Robot Learning from Demonstrations

Laskey, Fox, Goldberg 2018

2018 IEEE 14th International Conference on Automation Science and Engineering (CASE)


Learning from human demonstrations can facilitate automation but is risky because the execution of the learned policy might lead to collisions and other failures. Adding explicit constraints to avoid unsafe states is generally not possible when the state representations are complex. Furthermore, enforcing these constraints during execution of the learned policy can be challenging in environments where dynamics are difficult to model such as push mechanics in grasping. In this paper, we propose Derivative-Free Recovery (DFR), a two-phase method for generating robust policies from demonstrations in robotic manipulation tasks where the system comes to rest at each time step. In the first phase, we use support estimation of supervisor demonstrations and treat the support as implicit constraints on states. We also propose a time-varying modification for sequential tasks. In the second phase, we use this support estimate to derive a switching policy that employs the learned policy in the interior of the support and switches to a recovery policy to steer the robot away from the boundary of the support if it drifts too close. We present additional conditions, which linearly bound the difference in state at each time step by the magnitude of control, allowing us to prove that the robot will not violate the constraints using the recovery policy. A simulated pushing task in MuJoCo suggests that DFR can reduce collisions by 83%. On a physical line tracking task using a da Vinci Surgical Robot and a moving Stewart platform, DFR reduced collisions by 84%.
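The two-phase structure described in the abstract (estimate the support of the demonstrations, then switch between the learned policy and a recovery policy near the support boundary) can be sketched in a toy one-dimensional setting. This is only an illustration under strong assumptions and not the DFR procedure itself: the support estimate is a simple distance-to-nearest-demonstration score, the recovery step picks among a few candidate controls by evaluating their outcome through a `step` function, and all names and numbers are hypothetical.

```python
from typing import Callable, Sequence


def support_score(x: float, demo_states: Sequence[float], radius: float) -> float:
    """Toy support estimate: positive inside the support, negative outside."""
    return radius - min(abs(x - s) for s in demo_states)


def switching_policy(
    x: float,
    learned_policy: Callable[[float], float],
    demo_states: Sequence[float],
    radius: float,
    margin: float,
    candidates: Sequence[float],
    step: Callable[[float, float], float],
) -> float:
    """Use the learned policy well inside the support, else recover."""
    if support_score(x, demo_states, radius) > margin:
        return learned_policy(x)
    # Near or past the boundary: choose the candidate control whose
    # resulting state scores highest under the support estimate.
    return max(
        candidates,
        key=lambda u: support_score(step(x, u), demo_states, radius),
    )


demo_states = [0.0, 1.0, 2.0]
args = dict(
    learned_policy=lambda x: 0.3,  # hypothetical learned controller
    demo_states=demo_states,
    radius=0.5,
    margin=0.1,
    candidates=[-0.2, 0.0, 0.2],
    step=lambda x, u: x + u,  # assumed toy dynamics
)
inside = switching_policy(1.0, **args)  # well inside the support
near_edge = switching_policy(2.45, **args)  # close to the boundary
```

Here the state at 1.0 is handled by the learned policy, while the state at 2.45 is steered back toward the demonstrations by the recovery branch.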
