Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter

Danielczuk, Michael; Kurenkov, Andrey; Balakrishna, Ashwin; Matl, Matthew; Wang, David; Martín-Martín, Roberto; Garg, Animesh; Savarese, Silvio; Goldberg, Ken

doi:10.1109/icra.2019.8794143

Cited by 112 publications

(90 citation statements)

References 51 publications

“…In this approach, the object region was estimated by the density-based spatial clustering of applications with noise (DBSCAN) algorithm, and a depth difference image (DDI) that represents the depth difference between adjacent areas is defined. Different frameworks have been presented in achieving grasping object in clutter such as active affordance exploration framework which leverages the privileges of affordance map and the active exploration [129], integrating perception, action selection, and manipulation policies to address a version of the Mechanical Search problem [130], actor model with neural network that combines Gaussian mixture and normalizing flows [131], joint learning of instance and semantic segmentation for robotic pick-and-place with heavy occlusions in clutter [132], and predicting the quality and the pose of grasp using U-Grasping fully convolutional neural network (UG-Net) based on pixel-wise using depth image [133].…”

Section: B Suction and Multifunctional Grasping (mentioning)

confidence: 99%

Review of Deep Reinforcement Learning-Based Object Grasping: Techniques, Open Challenges, and Recommendations

2020

The motivation behind our work is to review and analyze the most relevant studies on deep reinforcement learning-based object manipulation. Various studies are examined through a survey of existing literature and investigation of various aspects, namely, the intended applications, techniques applied, challenges faced by researchers and recommendations for minimizing obstacles. This review refers to all relevant articles on deep reinforcement learning-based object manipulation and solutions. The object grasping issue is a major manipulation challenge. Object grasping requires detection systems, methods and tools to facilitate efficient and fast agent training. Several studies have proposed that object grasping and its subtypes are the main elements in dealing with the environment and agent. Unlike other review articles, this review article provides different observations on deep reinforcement learning-based manipulation. The results of this comprehensive review of deep reinforcement learning in the manipulation field may be valuable for researchers and practitioners because they can expedite the establishment of important guidelines.

“…An investigation into the multi-step retrieval of an occluded object called Mechanical Search [15] specifically states: "[The] performance gap [between our method and a human supervisor] suggests a number of open questions, such as: Can better perception algorithms improve performance? Can we formulate different sets of low level policies to increase the diversity of manipulation capability?…”

Section: Current Pose Input (V T R T) (mentioning)

confidence: 99%

The CoSTAR Block Stacking Dataset: Learning with Workspace Constraints

Hundt, Jain, Lin, et al. 2019

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

A robot can now grasp an object more effectively than ever before, but once it has the object what happens next? We show that a mild relaxation of the task and workspace constraints implicit in existing object grasping datasets can cause neural network based grasping algorithms to fail on even a simple block stacking task when executed under more realistic circumstances. To address this, we introduce the JHU CoSTAR Block Stacking Dataset (BSD), where a robot interacts with 5.1 cm colored blocks to complete an order-fulfillment style block stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets. There are nearly 12,000 stacking attempts and over 2 million frames of real data. We discuss the ways in which this dataset provides a valuable resource for a broad range of other topics of investigation. We find that hand-designed neural networks that work on prior datasets do not generalize to this task. Thus, to establish a baseline for this dataset, we demonstrate an automated search of neural network based models using a novel multiple-input HyperTree MetaModel, and find a final model which makes reasonable 3D pose predictions for grasping and stacking on our dataset. The CoSTAR BSD, code, and instructions are available at sites.google.com/site/costardataset.

“…The occluding objects make it [difficult] or impossible to grasp the desired object, requiring the robot to interact first with the unknown clutter to improve the target's graspability. Such situations appear frequently in domains such as home robotics or even logistic centers, and is considered an instance of the Mechanical Search problem [1].…”

Section: Introduction (mentioning)

confidence: 99%

“…Previous approaches proposed carefully-coded heuristics [1,2] or learned [3,4] sequences of actions that try to discover and retrieve the desired object. In both cases, the problem is simplified by choosing the action space to be a set of linear pushes parameterized as a point on the clutter and a direction to push, and a retracting motion after each action.…”

Section: Introduction (mentioning)

confidence: 99%
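
The simplified action space described in the statement above (a linear push given by a point on the clutter and a direction, followed by a retracting motion) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `LinearPush`, the function `push_waypoints`, the fixed push length, and the retract height are all hypothetical and are not taken from any of the cited papers.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearPush:
    """Hypothetical push action: a start point on the clutter plus a direction."""
    x: float              # start point, metres, in the workspace plane
    y: float
    angle: float          # push direction, radians, measured from the x-axis
    length: float = 0.10  # assumed fixed push distance, metres

    def end_point(self):
        """Planar position where the straight-line push stops."""
        return (
            self.x + self.length * math.cos(self.angle),
            self.y + self.length * math.sin(self.angle),
        )


def push_waypoints(push, retract_height=0.05):
    """Return (x, y, z) waypoints: contact point, end of push, then retract upward."""
    end_x, end_y = push.end_point()
    return [
        (push.x, push.y, 0.0),           # touch down at the chosen point
        (end_x, end_y, 0.0),             # push in a straight line
        (end_x, end_y, retract_height),  # retract after the action
    ]


if __name__ == "__main__":
    for waypoint in push_waypoints(LinearPush(x=0.2, y=0.1, angle=math.pi / 2)):
        print(waypoint)
```

Keeping the action to a point and an angle makes each push a small, fixed-size vector, which is the kind of simplification the quoted passage attributes to earlier heuristic and learned approaches.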

Visuomotor Mechanical Search: Learning to Retrieve Target Objects in Clutter

Kurenkov, Taglic, Kulkarni, et al. 2020

2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Self Cite

When searching for objects in cluttered environments, it is often necessary to perform complex interactions in order to move occluding objects out of the way and fully reveal the object of interest and make it graspable. Due to the complexity of the physics involved and the lack of accurate models of the clutter, planning and controlling precise predefined interactions with accurate outcomes is extremely hard, if not impossible. In problems where accurate (forward) models are lacking, Deep Reinforcement Learning (RL) has been shown to be a viable solution to map observations (e.g. images) to good interactions in the form of closed-loop visuomotor policies. However, Deep RL is sample inefficient and fails when applied directly to the problem of unoccluding objects based on images. In this work we present a novel Deep RL procedure that combines i) teacher-aided exploration, ii) a critic with privileged information, and iii) mid-level representations, resulting in sample efficient and effective learning for the problem of uncovering a target object occluded by a heap of unknown objects. Our experiments show that our approach trains faster and converges to more efficient uncovering solutions than baselines and ablations, and that our uncovering policies lead to an average improvement in the graspability of the target object, facilitating downstream retrieval applications.
