Object SLAM-Based Active Mapping and Robotic Grasping

Wu, Yanmin; Zhang, Yunzhou; Zhu, Delong; Chen, Xin; Coleman, Sonya; Sun, Wenkai; Hu, Xinggang; Deng, Zhiqiang

doi:10.1109/3dv53792.2021.00144

Cited by 18 publications

(11 citation statements)

References 34 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Active vision [15] refers to actively manipulating the camera viewpoint to obtain the maximum information for different tasks. Active vision has received a lot of attention from the robotics community and has been employed in many applications, such as robot manipulation [16], reconstruction [5], [6], [17] and SLAM [18], [19], [20].…”

Section: Active Visionmentioning

confidence: 99%

“…Both ICP and our pose refinement module are provided the same depth data from the same viewpoint(s). An object pose is considered correct if it lies within 5-mm/5-degree (5,5), or 2-mm/2-degree (2, 2), of ground truth. Fig.…”

Section: Objectsmentioning

confidence: 99%

“…Eye Bolt Tube Fitting Chrome Screw Gear Zigzag ALL (5,5) ( [8], expressed as the correct detection rate. Both ICP and our pose refinement module are provided the same depth data from the same viewpoint(s).…”

Section: Objectsmentioning

confidence: 99%

“…The remaining problems are selecting camera viewpoints to maximize information gain and utilizing the multi-view acquired depth for object pose estimation, which is crucial for fast scene understanding. Some approaches have been proposed that predict the next-best-view (NBV) to complete the depth data on the target objects and estimate/refine 6D object poses [5], [6]. These studies assume that the complete depth data is necessary for the optimal object pose estimation, and aim to find camera viewpoints that can recover as much depth data as possible on all objects.…”

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

6D Pose Estimation for Textureless Objects on RGB Frames using Multi-View Optimization

Yang

Xue²,

Ghavidel³

et al. 2023

2023 IEEE International Conference on Robotics and Automation (ICRA)

View full text Add to dashboard Cite

6D pose estimation of textureless shiny objects has become an essential problem in many robotic applications. Many pose estimators require high-quality depth data, often measured by structured light cameras. However, when objects have shiny surfaces (e.g., metal parts), these cameras fail to sense complete depths from a single viewpoint due to the specular reflection, resulting in a significant drop in the final pose accuracy. To mitigate this issue, we present a complete active vision framework for 6D object pose refinement and next-bestview prediction. Specifically, we first develop an optimizationbased pose refinement module for the structured light camera. Our system then selects the next best camera viewpoint to collect depth measurements by minimizing the predicted uncertainty of the object pose. Compared to previous approaches, we additionally predict measurement uncertainties of future viewpoints by online rendering, which significantly improves the next-best-view prediction performance. We test our approach on the challenging real-world ROBI dataset. The results demonstrate that our pose refinement method outperforms the traditional ICP-based approach when given the same input depth data, and our next-best-view strategy can achieve high object pose accuracy with significantly fewer viewpoints than the heuristic-based policies.

show abstract

Section: Active Visionmentioning

confidence: 99%

Section: Objectsmentioning

confidence: 99%

Section: Objectsmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 2 more Smart Citations

6D Pose Estimation for Textureless Objects on RGB Frames using Multi-View Optimization

Yang

Xue²,

Ghavidel³

et al. 2023

2023 IEEE International Conference on Robotics and Automation (ICRA)

View full text Add to dashboard Cite

show abstract

“…Previous studies [1]- [3] concentrated on providing accurate ego-motion estimation and reconstructing scene maps. However, the sparse or dense maps constructed by these methods only contain metric information, which limits their application in complex tasks [4], [5] that require scene understanding. The development of deep learning has paved the way for introducing semantic information into SLAM, and object SLAM that incorporates detection [6] or semantic segmentation [7] has attracted the interest of many researchers.…”

Section: Introductionmentioning

confidence: 99%

MMNeRF: Multi-Modal and Multi-View Optimized Cross-Scene Neural Radiance Fields

Zhang

Wang²,

Yang

et al. 2023

IEEE Access

View full text Add to dashboard Cite

We present MMNeRF, a simple yet powerful learning framework for highly photo-realistic novel view synthesis by learning Multi-modal and Multi-view features to guide neural radiance fields to a generic model. Novel view synthesis has achieved great improvement with the significant success of NeRFseries methods. However, how to make the method generic across scenes has always been a challenging task. A good idea is to introduce 2D image features as prior knowledge for adaptive modeling, yet RGB features lack geometry and 3D spatial information, which causes shape-radiance ambiguity issues and lead to blurry and low-resolution results in the synthesis images. We propose a multi-modal multi-view method to make up for the existing methods. Specifically, we introduce depth features besides RGB features into the model and effectively fuse these multi-modal features by modality-based attention. Furthermore, Our framework innovatively adopts the transformer encoder to fuse multi-view features and uses the transformer decoder to adaptively incorporate the target view with global memory. Extensive experiments are carried out on both categories-specific and category-agnostic benchmarks, and the results demonstrate that our MMNeRF achieves state-of-the-art neural rendering performance. INDEX TERMSNeural rendering, Novel view synthesis, Vision transformer, 3D implicit reconstruction Nowadays, several works (e.g., [7]-[9], [11]-[13]) are committed to the study of generality to solve the vanilla

show abstract

SQ-SLAM: Monocular Semantic SLAM Based on Superquadric Object Representation

Han,

Yang

2023

J Intell Robot Syst

View full text Add to dashboard Cite

Object SLAM-Based Active Mapping and Robotic Grasping

Cited by 18 publications

References 34 publications

6D Pose Estimation for Textureless Objects on RGB Frames using Multi-View Optimization

6D Pose Estimation for Textureless Objects on RGB Frames using Multi-View Optimization

MMNeRF: Multi-Modal and Multi-View Optimized Cross-Scene Neural Radiance Fields

SQ-SLAM: Monocular Semantic SLAM Based on Superquadric Object Representation

Contact Info

Product

Resources

About