2022
DOI: 10.1109/lra.2022.3143198

Hierarchical Policies for Cluttered-Scene Grasping With Latent Plans

Abstract: 6D grasping in cluttered scenes is a longstanding robotic manipulation problem. Open-loop manipulation pipelines can fail due to modularity and error sensitivity while most end-to-end grasping policies with raw perception inputs have not yet scaled to complex scenes with obstacles. In this work, we propose a new method to close the gap through sampling and selecting plans in the latent space. Our hierarchical framework learns collision-free target-driven grasping based on partial point cloud observations. Our …

Citations: cited by 20 publications (8 citation statements)
References: 45 publications
“…Hierarchical RL [81], [139] [66], [71], [94], [95], [103], [129] Perfect simulator [40], [77], [90], [109] [42], [46], [137], [141] Domain randomization [84], [98], [108], [124] [91], [102], [123], [128] Domain adaptation [17], [43], [52], [64], [75], [104], [110], [143] adversarial network (GAN) capable of translating real tactile images to simulated depth images. Ning et al [97] introduce an autonomous robotic ultrasound imaging system, where the observation is a concatenated latent vector of two conventional autoencoders.…”
Section: Guided RL Methods Source; citation type: mentioning
confidence: 99%
“…Nachum et al [94] employ a hierarchy to learn low-level goal reaching skills coordinated by a high-level controller for coordinated multiagent object manipulation. Wang et al [129] apply a hierarchical policy in a cluttered-scene grasping setting that learns an embedding space on expert plans and chooses sampled plans via a critic as well as appropriate options [122] via an option classifier. Finally, Li et al [71] adopt a hierarchical structure for interactive navigation tasks, where a high-level policy generates subgoals and selects low-level policies returning task phase-specific robot actions.…”
Section: Hierarchical RL; citation type: mentioning
confidence: 99%
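
The "sample plans, then choose one via a critic" step described in the statement above can be illustrated with a small sketch. This is not the cited authors' implementation: the standard normal prior, the function names (`sample_latents`, `toy_critic`, `select_plan`), and the distance-based scoring rule are all assumptions made for illustration. In the cited work the critic is learned, and the chosen latent would still have to be decoded into a trajectory; the option classifier is omitted here.

```python
import random

def sample_latents(num_samples, dim, rng):
    """Draw candidate latent plans from a standard normal prior (assumed)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_samples)]

def toy_critic(latent, goal):
    """Hypothetical stand-in for a learned critic.

    Scores a latent higher the closer it lies to a goal vector.
    """
    return -sum((z - g) ** 2 for z, g in zip(latent, goal))

def select_plan(latents, critic, goal):
    """Return the candidate latent with the highest critic score."""
    return max(latents, key=lambda z: critic(z, goal))

rng = random.Random(0)                      # fixed seed for repeatability
candidates = sample_latents(num_samples=16, dim=4, rng=rng)
goal = [0.5, -0.5, 0.0, 1.0]                # illustrative target, not from the paper
best = select_plan(candidates, toy_critic, goal)
```

The selection is a plain argmax over a finite sample, so its quality depends on how many candidates are drawn and on how well the critic ranks them.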
“…Initially designed for indoor human-computer interaction, it has been successfully applied in various automation scenarios. As described in [32], algorithms for contour and spatial positioning of planar shapes can be detected using Kinect. Figure 3 shows RGB and depth images captured at 640 × 480 px resolution in our work, used for extracting features such as obstacles, current location, and target positioning.…”
Section: End-to-End DDPG; citation type: mentioning
confidence: 99%
“…closing a drawer quickly or slowly. We address this inherent multi-modality by auto-encoding contextual data through a latent plan space with a sequence-to-sequence conditional variational auto-encoder (seq2seq CVAE) [13,45]. Conditioning the action decoder on the latent plan allows the policy to use the entirety of its capacity for learning uni-modal behavior.…”
Section: Learning the Low-Level Policy; citation type: mentioning
confidence: 99%
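
The idea in the statement above, conditioning an action decoder on a sampled latent plan, can be sketched in a few lines. This is a toy under stated assumptions, not the citing paper's seq2seq CVAE: the sequence encoder is left out, and `reparameterize`, `kl_to_standard_normal`, and `toy_decoder` (with its fixed weights) are hypothetical names chosen for illustration. Only the Gaussian reparameterization and the closed-form KL term to a standard normal prior are standard.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def toy_decoder(observation, z):
    """Hypothetical action decoder.

    The input is the observation concatenated with the latent plan, so the
    same observation can yield different actions under different plans.
    """
    features = observation + z              # list concatenation
    return [math.tanh(sum(features) * w) for w in (0.1, -0.2)]

rng = random.Random(0)
mu, log_var = [0.0, 0.0], [0.0, 0.0]        # posterior equal to the prior
z = reparameterize(mu, log_var, rng)
action = toy_decoder([0.3, -0.1, 0.8], z)
```

Because the latent plan carries the choice between modes, the decoder itself only has to model a single mode of behavior for each plan.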