2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.01214

Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction

Abstract: We present MonoPSR, a monocular 3D object detection method that leverages proposals and shape reconstruction. First, using the fundamental relations of a pinhole camera model, detections from a mature 2D object detector are used to generate a 3D proposal per object in a scene. The 3D locations of these proposals prove to be quite accurate, which greatly reduces the difficulty of regressing the final 3D bounding box detection. Simultaneously, a point cloud is predicted in an object-centered coordinate system to …
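
The abstract does not spell out which pinhole relations are used, so the following is only a minimal sketch of one common way to turn a 2D box into a 3D centroid proposal: depth from similar triangles, followed by back-projection of the box center. The names `Box2D` and `propose_centroid`, and the use of an assumed metric object height as input, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Box2D:
    """Axis-aligned 2D detection in pixel coordinates (hypothetical type)."""
    u_min: float
    v_min: float
    u_max: float
    v_max: float


def propose_centroid(box, object_height_m, focal_px, cx, cy):
    """Estimate a 3D centroid (x, y, z) in the camera frame.

    Assumptions (not from the source): the object's metric height is known
    or predicted elsewhere, the 2D box height matches the projected object
    height, and the camera has a single focal length with principal point
    (cx, cy) and no lens distortion.
    """
    box_height_px = box.v_max - box.v_min
    if box_height_px <= 0:
        raise ValueError("box must have positive height")

    # Similar triangles: box_height_px / focal_px = object_height_m / depth
    depth = focal_px * object_height_m / box_height_px

    # Back-project the 2D box center to the estimated depth.
    u_c = 0.5 * (box.u_min + box.u_max)
    v_c = 0.5 * (box.v_min + box.v_max)
    x = (u_c - cx) * depth / focal_px
    y = (v_c - cy) * depth / focal_px
    return (x, y, depth)
```

Under these assumptions, a box 100 pixels tall with an assumed object height of 1.5 m and a focal length of 700 pixels gives a depth of 700 × 1.5 / 100 = 10.5 m.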

Cited by 283 publications (146 citation statements)
References 36 publications
“…We compare our approach with state-of-the-art methods [2], [6], [12], [13], [15], [16], [31], which are divided into two groups depending on the input (i.e., point clouds or camera images). One group consists of MonoPSR [31] (Mono-based) and Stereo R-CNN [6] (Stereo-based) which process camera images with RGB information. The other group includes MV3D (LiDAR) [2], BirdNet [12], RT3D [13], VeloFCN [15] and LMNet [16] which are based on point clouds only.…”
Section: B Comparison With State-of-the-art Methods
Citation type: mentioning
confidence: 99%
“…There are some evaluation indicators that can be used for object detection, such as Precision, Recall, F1 score, average precision (AP), and mean average precision (mAP) as expressed by Formulas (4)–(8) [108,109,110,111,112,113,114], respectively. Precision represents the proportion of all identified correct instances.…”
Section: Applications Of Point Clouds Using Deep Learning
Citation type: mentioning
confidence: 99%
“…Unlike the previous categories of methods, i.e., classification-based and regression-based, this category performs the classification and regression tasks within a single architecture. The methods can firstly do the classification, the outcomes of which are cured in a regression-based refinement step [105], [84], [78], [166] or vice versa [75], or can do the classification and regression in a single-shot process [87], [145], [101], [106], [100], [148], [103], [102], [30], [37], [162].…”
Section: B Regression
Citation type: mentioning
confidence: 99%
“…The regression of d is conducted by the L2 loss, while the bin-based discrete-continuous loss is applied to firstly discretize θ_y into n overlapping bins, and then to regress the angle within each bin. The input of MonoPSR [106] is an RGB image, which is not subjected to any pre-processing. Once the 2D BB proposals for the object of interest are generated using MS-CNN [123], MonoPSR hypothesises 3D proposals, which are then fed into a CNN scoring refinement step.…”
Section: Classification and Regression
Citation type: mentioning
confidence: 99%
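
The last statement above describes discretizing θ_y into n overlapping bins and then regressing the angle within each bin. The sketch below illustrates only the target encoding and decoding for such a scheme; it is not taken from MonoPSR or the citing paper. The function names, the evenly spaced bin centers, and the `overlap` fraction are all assumptions made for illustration.

```python
import math


def wrap_to_pi(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def bin_centers(n_bins):
    """Evenly spaced bin centers around the circle (assumed layout)."""
    return [wrap_to_pi(i * 2.0 * math.pi / n_bins) for i in range(n_bins)]


def encode_angle(theta, n_bins, overlap=0.1):
    """Return one (covers, residual) pair per bin.

    covers   -- True if theta falls inside the bin, where each bin is
                widened by the `overlap` fraction so neighbours overlap.
    residual -- signed offset of theta from the bin center, which is the
                continuous regression target for that bin.
    """
    half_width = 0.5 * (2.0 * math.pi / n_bins) * (1.0 + overlap)
    targets = []
    for center in bin_centers(n_bins):
        residual = wrap_to_pi(theta - center)
        targets.append((abs(residual) <= half_width, residual))
    return targets


def decode_angle(bin_index, residual, n_bins):
    """Recover the angle from a chosen bin and its regressed residual."""
    return wrap_to_pi(bin_centers(n_bins)[bin_index] + residual)
```

At training time, the `covers` flags would typically supervise a per-bin classification term and the residuals a regression term for the covering bins; at inference, the angle is decoded from the highest-scoring bin. Because the bins overlap, an angle near a boundary is a valid target for both neighbouring bins.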