2021
DOI: 10.1007/s11263-021-01456-w

Vote-Based 3D Object Detection with Context Modeling and SOB-3DNMS

Abstract: Most existing 3D object detection methods recognize objects individually, without considering the contextual information between them. However, objects in indoor scenes are usually related to each other and to the scene, which forms contextual information. Based on this observation, we propose a novel 3D object detection network, which is built on the state-of-the-art VoteNet but takes contextual information at multiple levels into consideration for the detection and recognition of 3D obje…

Cited by 22 publications (6 citation statements)
References 66 publications
“…On the basis of VoteNet [109], Xie et al [49], [110], for the first time, introduce the self-attention mechanism of Transformers into the task of 3D object detection in indoor scenes. They propose the Multi-Level Context VoteNet (MLCVNet) to improve detection performance by encoding contextual information.…”
Section: D Tasks (mentioning)
confidence: 99%
“…The environment consists of multiple sensors for identifying moving objects, static objects, and obstacles. However, the RGB-D images and LiDAR images have many noises due to environmental factors which lead to image degradation hence it needs to be removed for obtaining high-quality images and cloud points for object detection [7], [8].…”
Section: Index Terms - 3D Object Detection, Noise Removal, Semantic Segmen... (mentioning)
confidence: 99%
“…MLCVNet [30] exploits multi-level contextual information and fuses them to vote for the object center. MLCVNet++ [31] further improves the network by proposing 3DNMS to remove redundant detections during post-processing. Yan et al [32] combine texture information from RGB data and geometric information from point cloud data, and use the deep Hough voting algorithm to suggest objects.…”
Section: Related Work (mentioning)
confidence: 99%