2022
DOI: 10.48550/arxiv.2204.03039
Preprint
DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors

Abstract: Camera-based 3D object detectors are attractive due to their wider deployability and lower price compared with LiDAR sensors. We revisit the prior stereo model DSGN and its stereo volume construction for representing both 3D geometry and semantics. We refine the stereo modeling and propose our approach, DSGN++, aiming to improve information flow throughout the 2D-to-3D pipeline in the following three main aspects. First, to effectively lift 2D information to the stereo volume, we propose depth-wise plane sweeping (D…


Cited by 1 publication (2 citation statements)
References 66 publications
“…Difference-defined cost volume: in computation, this algorithm iterates over the D (maximum disparity) dimension. The cost volume here can be considered the similarity between the left and right cameras at disparity i, as disp_i(x, y) = left(x, y) − right(x + i, y), like DSGN [18] and DSGN++ [17].…”
Section: Camera
confidence: 99%
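The per-disparity difference quoted above can be sketched in NumPy. This is a minimal illustration of the formula disp_i(x, y) = left(x, y) − right(x + i, y), not the implementation from DSGN or DSGN++; the function name, tensor layout (D, H, W), and zero-padding at the image border are illustrative assumptions:

```python
import numpy as np

def difference_cost_volume(left, right, max_disp):
    """Build a difference-defined cost volume of shape (max_disp, H, W).

    left, right: (H, W) arrays (feature maps or grayscale images),
    indexed as [y, x]. For each disparity i we compute
    left(x, y) - right(x + i, y), zero-padding right beyond the border
    (padding choice is an assumption, not from the cited paper).
    """
    H, W = left.shape
    cost = np.zeros((max_disp, H, W), dtype=left.dtype)
    for i in range(max_disp):
        # Shift the right image so that shifted[y, x] == right[y, x + i]
        shifted = np.zeros_like(right)
        shifted[:, : W - i if i else W] = right[:, i:]
        cost[i] = left - shifted
    return cost
```

For identical left and right inputs, the slice at disparity 0 is all zeros, since each pixel is compared against itself.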
“…DSGN++ [17] is improved from DSGN [18]. This allows the voxel to aggregate 3D structure information at different spacings and further expands the 2D-to-3D information flow. To perceive accurate front-surface depths, the depth head on the stereo volume in 3D space (DSV) is first transformed to the frustum space, followed by front-view depth supervision.…”
Section: Camera
confidence: 99%