2017
DOI: 10.48550/arxiv.1703.11000
Preprint

Learning Visual Servoing with Deep Features and Fitted Q-Iteration

Alex X. Lee,
Sergey Levine,
Pieter Abbeel

Abstract: Visual servoing involves choosing actions that move a robot in response to observations from a camera, in order to reach a goal configuration in the world. Standard visual servoing approaches typically rely on manually designed features and analytical dynamics models, which limits their generalization capability and often requires extensive application-specific feature and model engineering. In this work, we study how learned visual features, learned predictive dynamics models, and reinforcement learning can be…
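The abstract sketches the paper's recipe: pre-trained visual features, a learned predictive dynamics model over those features, and a fitted Q-iteration procedure that trains the servoing policy from batches of transitions. As a rough illustration of the fitted Q-iteration idea only (a minimal sketch, not the authors' parameterization: the linear feature map, random transition batch, and every hyperparameter below are hypothetical placeholders), the algorithm alternates between computing bootstrapped Bellman targets from a fixed batch and regressing the Q-function onto them:

```python
import numpy as np

# Minimal fitted Q-iteration sketch: linear Q-function over state
# features with a small discrete action set. All names, shapes, and
# hyperparameters are illustrative placeholders, not the paper's.

rng = np.random.default_rng(0)
n_features, n_actions, gamma = 8, 4, 0.9

def phi(s, a):
    """Joint state-action feature: one block of state features per action."""
    f = np.zeros(n_features * n_actions)
    f[a * n_features:(a + 1) * n_features] = s
    return f

# Fixed batch of transitions (s, a, r, s') -- random stand-ins for
# observations collected while servoing toward a target.
batch = [(rng.normal(size=n_features),
          int(rng.integers(n_actions)),
          float(rng.normal()),
          rng.normal(size=n_features)) for _ in range(500)]

w = np.zeros(n_features * n_actions)  # Q(s, a) = w . phi(s, a)

for _ in range(50):
    # Bootstrapped Bellman targets under the current Q estimate.
    X = np.array([phi(s, a) for s, a, _, _ in batch])
    y = np.array([r + gamma * max(w @ phi(s2, a2) for a2 in range(n_actions))
                  for _, _, r, s2 in batch])
    # "Fitted" step: least-squares regression of targets onto features.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

def greedy(s):
    """Pick the action with the highest estimated Q-value."""
    return max(range(n_actions), key=lambda a: w @ phi(s, a))

print("greedy action for a sample state:", greedy(batch[0][0]))
```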

Cited by 10 publications (13 citation statements)
References 19 publications
“…The first category of tasks is ones where the goal location is known, and limited exploration is necessary. This could be in the form of simply wandering around without colliding [15,34], following an object [23], or getting to a goal coordinate [1,17] using a sequence of images along the path [5,22] or language-based instructions [2]. Sometimes, the goal is specified as an image but experience from the environment is available in the form of demonstrations [13,35], or in the form of reward-based training [25,48], which again limits the role of exploration.…”
Section: Related Work
confidence: 99%
“…Depending on the problem being considered, different representations have been investigated. For short-range locomotion tasks, purely reactive policies [3,15,23,34] suffice. For more complex problems such as target-driven navigation in a novel environment, such purely reactive strategies do not work well [48], and memory-based policies have been investigated.…”
Section: Related Work
confidence: 99%
“…[10], instead, trains a neural network from single-view image streams to predict the probability of successful grasps, thus learning hand-eye coordination for grasping. Interesting related works on visual DRL for robotics are also [11], [12], [13], [14]. Data-efficient DRL for DPG-based dexterous manipulation has been further explored in [15], which mainly focuses on stacking Lego blocks.…”
Section: State of the Art
confidence: 99%
“…Visual servoing: Finally, there have been multiple approaches to visual servoing over the years [1], [17], [18], including some newer methods that use deep learned features and reinforcement learning [19]. While these methods depend on an external system for data association or on pre-specified features, our system is trained end-to-end and can control directly from raw depth data.…”
Section: Related Work
confidence: 99%