2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00864
Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving

Abstract: 3D object detection is an essential task in autonomous driving. Recent techniques excel with highly accurate detection rates, provided the 3D input data is obtained from precise but expensive LiDAR technology. Approaches based on cheaper monocular or stereo imagery data have, until now, resulted in drastically lower accuracies, a gap that is commonly attributed to poor image-based depth estimation. However, in this paper we argue that it is not the quality of the data but its representation that accounts for t…

Cited by 966 publications (772 citation statements)
References 36 publications (120 reference statements)
“…Furthermore, additional work is required for bridging the gap between image- and LiDAR-based 3D perception (Wang et al., 2019), enabling the computer vision community to close the current debate on camera versus LiDAR as main perception sensors.…”
Section: Discussion (mentioning)
confidence: 99%
“…6D object pose estimators [40], [38], [163], [167], [168], [159], [160], [31], [32], [4], [35], [161] extract features from the input images and, using the trained regressor, estimate objects' 6D pose. Several methods further refine the output of the trained regressors [101], [83], [79], [82], [108], [40], [38], [163], [167], [168], [159], [160], [31], [32], [4], [35], [161] (refinement block), and finally hypothesise the object pose after filtering. Table III details the regression-based methods.…”
Section: A. Classification (mentioning)
confidence: 99%
“…Unlike the previous categories of methods, i.e., classification-based and regression-based, this category performs the classification and regression tasks within a single architecture. The methods can first do the classification, the outcomes of which are refined in a regression-based refinement step [105], [84], [78], [166] or vice versa [75], or can do the classification and regression in a single-shot process [87], [145], [101], [106], [100], [148], [103], [102], [30], [37], [162].…”
Section: B. Regression (mentioning)
confidence: 99%
“…, where n is the number of points. The point cloud obtained from the intermediate depth map is termed Pseudo-LiDAR [22].…”
Section: B. Transformation Module (mentioning)
confidence: 99%
“…Therefore, previous studies [3], [9], [19] prefer to perform different tasks on 2D depth maps or other projected views. Moreover, the target point cloud can be constructed in the form of Pseudo-LiDAR [22] using known camera intrinsics. For Pseudo-LiDAR interpolation, an intermediate depth map is first generated and then back-projected into 3D space.…”
Section: Introduction (mentioning)
confidence: 99%