2020
DOI: 10.1007/978-3-030-58555-6_3

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection

Abstract: In this paper, we aim at addressing two critical issues in the 3D detection task, including the exploitation of multiple sensors (namely LiDAR point cloud and camera image), as well as the inconsistency between the localization and classification confidence. To this end, we propose a novel fusion module to enhance the point features with semantic image features in a point-wise manner without any image annotations. Besides, a consistency enforcing loss is employed to explicitly encourage the consistency of both…

Cited by 327 publications (208 citation statements) · References 40 publications (71 reference statements)
“…To tackle the irregular data format of point clouds, most existing works project the point clouds onto regular grids that can be processed by 2D or 3D CNNs. The pioneering work MV3D [9] projects the point clouds onto 2D bird's-eye-view grids and places many predefined 3D anchors for generating 3D bounding boxes; the follow-up works [11], [13], [17], [56], [57], [58] develop better strategies for multi-sensor fusion. [12], [14], [15] introduce more efficient frameworks with the bird's-eye-view representation, while [59] proposes to fuse grid features of multiple scales.…”
Section: 3D Object Detection With Point Clouds
confidence: 99%
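For readers unfamiliar with the grid projection described in the excerpt above, the following is a minimal NumPy sketch of rasterizing a LiDAR point cloud into a bird's-eye-view grid in the MV3D style. The range bounds, resolution, and channel layout here are illustrative assumptions, not values taken from any cited paper.

```python
# Minimal sketch: rasterize a LiDAR point cloud into a BEV grid.
# Ranges and resolution are assumed for illustration only.
import numpy as np

def points_to_bev(points, x_range=(0.0, 70.4), y_range=(-40.0, 40.0),
                  resolution=0.1):
    """Rasterize (N, 4) LiDAR points [x, y, z, intensity] into a
    (3, H, W) map of occupancy, max height, and max intensity."""
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]
    cols = ((pts[:, 0] - x_range[0]) / resolution).astype(np.int32)
    rows = ((pts[:, 1] - y_range[0]) / resolution).astype(np.int32)
    h = int((y_range[1] - y_range[0]) / resolution)
    w = int((x_range[1] - x_range[0]) / resolution)
    bev = np.zeros((3, h, w), dtype=np.float32)
    bev[0, rows, cols] = 1.0                          # occupancy channel
    np.maximum.at(bev[1], (rows, cols), pts[:, 2])    # max height per cell
    np.maximum.at(bev[2], (rows, cols), pts[:, 3])    # max intensity per cell
    return bev
```

A 2D CNN backbone can then consume this (3, H, W) tensor exactly like an image, which is the appeal of the BEV projection over raw, irregular point sets.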
“…In the process of 3D object detection, inconsistency between the localization and classification confidence is a critical issue [75]. To address this problem, EPNet [76] employs a consistency enforcing loss that explicitly increases the consistency between localization and classification. Moreover, the point features are enhanced with semantic image features in a point-wise manner without image annotations.…”
Section: Multi-Sensor Fusion-Based 3D Object Detection Methods
confidence: 99%
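As a rough illustration of how such a loss couples the two confidences, here is a hedged PyTorch sketch that rewards predictions whose classification confidence and localization IoU are jointly high. The exact formulation in EPNet may differ, and the function name and epsilon are assumptions for this sketch; in practice such a loss is typically applied only to foreground (positive) predictions.

```python
# Hedged sketch of a consistency-enforcing-style loss: a prediction can only
# achieve low loss if it is BOTH confidently classified and well localized.
import torch

def consistency_enforcing_loss(cls_conf, iou, eps=1e-6):
    """cls_conf: (N,) classification confidence in [0, 1].
    iou: (N,) IoU between each predicted box and its ground-truth box."""
    return -torch.log(cls_conf * iou + eps).mean()
```

The product inside the log is the key design choice: a confident but poorly localized box (high `cls_conf`, low `iou`) is penalized just as heavily as an accurate but under-confident one, pushing the two scores to agree.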
“…It therefore requires point-wise feature extraction and a cross-level fusion strategy. EPNet [199] (shown in Figure 9) introduces a novel LI-Fusion module, a point-wise fusion scheme that enhances the point cloud features with image features. LI-Fusion comprises a point-wise correspondence generation part and a LiDAR-guided fusion part.…”
Section: Cross-Level Fusion Strategy
confidence: 99%
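The two parts named above can be sketched as follows: point-wise correspondence projects each 3D point into the image plane and bilinearly samples image features at that pixel, and a gated combination stands in for the LiDAR-guided fusion. The projection matrix `P`, the `gate_mlp` module, and the single-sided gate are simplifying assumptions of this sketch; EPNet's actual fusion computes its weights from both the LiDAR and image streams.

```python
# Minimal sketch of an LI-Fusion-style point-wise fusion step, assuming a
# (3, 4) LiDAR-to-image projection matrix P and a (C, H, W) image feature map.
import torch
import torch.nn.functional as F

def li_fusion(point_feats, points_xyz, image_feats, P, gate_mlp):
    """point_feats: (N, C) LiDAR point features; points_xyz: (N, 3);
    image_feats: (C, H, W); P: (3, 4) projection; gate_mlp: nn.Module
    mapping (N, C) -> (N, C)."""
    # Point-wise correspondence: project each 3D point into pixel coordinates.
    ones = torch.ones(points_xyz.shape[0], 1, device=points_xyz.device)
    uvw = torch.cat([points_xyz, ones], dim=1) @ P.t()       # (N, 3)
    uv = uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)            # (N, 2) pixels
    h, w = image_feats.shape[1:]
    # Normalize to [-1, 1] and bilinearly sample image features per point.
    grid = torch.stack([uv[:, 0] / (w - 1), uv[:, 1] / (h - 1)], dim=1) * 2 - 1
    sampled = F.grid_sample(image_feats[None], grid[None, :, None, :],
                            align_corners=True)              # (1, C, N, 1)
    img_point_feats = sampled[0, :, :, 0].t()                # (N, C)
    # LiDAR-guided fusion (simplified): the point feature gates how much
    # image semantics to admit before the streams are concatenated.
    gate = torch.sigmoid(gate_mlp(point_feats))              # (N, C)
    return torch.cat([point_feats, gate * img_point_feats], dim=1)
```

The gating step is what makes the fusion "LiDAR-guided": image features are weighted per point before being merged, so unreliable image evidence (e.g., from occluded or over-exposed regions) can be suppressed rather than averaged in blindly.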