Kiru Park scite author profile

Estimating the 6D pose of objects using only RGB images remains challenging because of problems such as occlusion and symmetries. It is also difficult to construct 3D models with precise texture without expert knowledge or specialized scanning devices. To address these problems, we propose a novel pose estimation method, Pix2Pose, that predicts the 3D coordinates of each object pixel without textured models. An auto-encoder architecture is designed to estimate the 3D coordinates and expected errors per pixel. These pixel-wise predictions are then used in multiple stages to form 2D-3D correspondences to directly compute poses with the PnP algorithm with RANSAC iterations. Our method is robust to occlusion by leveraging recent achievements in generative adversarial training to precisely recover occluded parts. Furthermore, a novel loss function, the transformer loss, is proposed to handle symmetric objects by guiding predictions to the closest symmetric pose. Evaluations on three different benchmark datasets containing symmetric and occluded objects show our method outperforms the state of the art using only RGB images.

show abstract

Multi-Task Template Matching for Object Detection, Segmentation and Pose Estimation Using Depth Images

Park¹,

Patten²,

Prankl³

et al. 2019

View full text Add to dashboard Cite

DGCM-Net: Dense Geometrical Correspondence Matching Network for Incremental Experience-Based Robotic Grasping

2020

View full text Add to dashboard Cite

This article presents a method for grasping novel objects by learning from experience. Successful attempts are remembered and then used to guide future grasps such that more reliable grasping is achieved over time. To transfer the learned experience to unseen objects, we introduce the dense geometric correspondence matching network (DGCM-Net). This applies metric learning to encode objects with similar geometry nearby in feature space. Retrieving relevant experience for an unseen object is thus a nearest neighbor search with the encoded feature maps. DGCM-Net also reconstructs 3D-3D correspondences using the view-dependent normalized object coordinate space to transform grasp configurations from retrieved samples to unseen objects. In comparison to baseline methods, our approach achieves an equivalent grasp success rate. However, the baselines are significantly improved when fusing the knowledge from experience with their grasp proposal strategy. Offline experiments with a grasping dataset highlight the capability to transfer grasps to new instances as well as to improve success rate over time from increasing experience. Lastly, by learning task-relevant grasps, our approach can prioritize grasp configurations that enable the functional use of objects.

show abstract

Unsupervised Domain Adaptation Through Inter-Modal Rotation for RGB-D Object Recognition

Loghmani¹,

Robbiano

Park³

et al. 2020

IEEE Robot. Autom. Lett.

View full text Add to dashboard Cite

Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images

Park

Patten

Vincze

2020

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Kiru Park

Pix2Pose: Pixel-Wise Coordinate Regression of Objects for 6D Pose Estimation

Multi-Task Template Matching for Object Detection, Segmentation and Pose Estimation Using Depth Images

DGCM-Net: Dense Geometrical Correspondence Matching Network for Incremental Experience-Based Robotic Grasping

Unsupervised Domain Adaptation Through Inter-Modal Rotation for RGB-D Object Recognition

Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images

Contact Info

Product

Resources

About