2023
DOI: 10.3390/s23063340

Bilateral Cross-Modal Fusion Network for Robot Grasp Detection

Abstract: In the field of vision-based robot grasping, effectively leveraging RGB and depth information to accurately determine the position and pose of a target is a critical issue. To address this challenge, we proposed a tri-stream cross-modal fusion architecture for 2-DoF visual grasp detection. This architecture facilitates bilateral interaction between RGB and depth information and is designed to efficiently aggregate multiscale information. Our novel modal interaction module (MIM) with a spatial-wise cross-attent…
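
The abstract describes a modal interaction module (MIM) built around spatial-wise cross-attention between RGB and depth streams. The paper's exact design is not visible in the truncated abstract, so the following is only a minimal sketch of what such a spatial cross-attention fusion block could look like; the module name, shapes, and the 1x1-convolution attention heads are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a spatial-wise cross-attention fusion block for
# RGB-D feature maps, in the spirit of the paper's modal interaction module.
# All names and design choices here are assumptions, not the paper's code.
import torch
import torch.nn as nn


class SpatialCrossAttentionFusion(nn.Module):
    """Fuse RGB and depth feature maps via spatial cross-attention.

    Each modality produces a per-pixel attention map that reweights the
    *other* modality, so depth cues can gate RGB features and vice versa
    (assumed behaviour for illustration).
    """

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convs squeeze each modality to a single-channel spatial map in [0, 1].
        self.rgb_to_attn = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
        self.depth_to_attn = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
        # 1x1 conv merges the two cross-modulated streams back to `channels`.
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        # Attention derived from depth modulates RGB features, and vice versa.
        rgb_mod = rgb_feat * self.depth_to_attn(depth_feat)
        depth_mod = depth_feat * self.rgb_to_attn(rgb_feat)
        return self.merge(torch.cat([rgb_mod, depth_mod], dim=1))


if __name__ == "__main__":
    fuse = SpatialCrossAttentionFusion(channels=64)
    rgb = torch.randn(1, 64, 56, 56)    # RGB feature map
    depth = torch.randn(1, 64, 56, 56)  # depth feature map
    print(fuse(rgb, depth).shape)       # torch.Size([1, 64, 56, 56])
```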

Cited by 2 publications (1 citation statement). References 42 publications (88 reference statements).

“…Zhang et al. [59] proposed a hybrid Transformer-CNN method for 2-DoF object pose detection. They further proposed a bilateral neural network architecture [60] for RGB and depth image fusion and achieved promising results. In the 6-DoF pose detection area, Wang et al. [6] introduced the DenseFusion framework for precise 6-DoF pose estimation using two data sources and a dense fusion network.…”
Section: Multi-modal Data Based Object Pose Estimation
confidence: 99%