2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
DOI: 10.1109/cvpr52688.2022.00826

VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention

Cited by 64 publications (15 citation statements)
References 15 publications
“…Note that both FastPillars and PillarNet make use of the same training settings in the experiment. Particularly, compared with VISTA [7], the mAP/NDS of our FastPillars-s/FastPillars-m outperforms it 1.6%/2.8% and 0.3%/1.2% respectively. The performance of FastPillars-s and FastPillars-m could be further boosted by a substantial margin 1.4%/1.0% in mAP/NDS (from 64.6% to 66.0%/70.1% to 71.1%) and 1.0%/0.8% (from 65.8% to 66.8%/71.0% to 71.8%) with the fade strategy, respectively.…”
Section: Methods (mentioning)
confidence: 86%
“…The voxel-based approaches [1], [2], [7], [13], [15], [16], [20], [21] initially split point clouds into 3D voxels through voxelization, owing to the unstructured nature of point clouds and the varying point density. To be more specific, the voxelization process is a maximal or average pooling operation of the points in each voxel, thus producing voxel features.…”
Section: A Voxel-based Methods (mentioning)
confidence: 99%
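
The pooling view of voxelization described in the statement above can be illustrated with a minimal sketch. The function name `voxelize_mean`, the `voxel_size` parameter, and the toy points are assumptions made for illustration only; they are not taken from VISTA or from the citing paper. The sketch pools raw point coordinates by averaging, which is only one of the two pooling choices the statement mentions.

```python
from collections import defaultdict
from math import floor


def voxelize_mean(points, voxel_size):
    """Group 3D points into voxels and mean-pool the points in each voxel.

    points: iterable of (x, y, z) tuples.
    voxel_size: (sx, sy, sz) voxel edge lengths (hypothetical parameter).
    Returns a dict mapping integer voxel index -> mean (x, y, z) of its points.
    """
    buckets = defaultdict(list)
    for p in points:
        # The integer grid index of a point is floor(coordinate / edge length).
        idx = tuple(floor(c / s) for c, s in zip(p, voxel_size))
        buckets[idx].append(p)
    # Average pooling: one feature per non-empty voxel.
    return {
        idx: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for idx, pts in buckets.items()
    }


# Toy point cloud (illustrative values only).
points = [(0.1, 0.2, 0.0), (0.3, 0.4, 0.2), (1.2, 0.1, 0.0)]
voxels = voxelize_mean(points, voxel_size=(0.5, 0.5, 0.5))
```

Only non-empty voxels are stored, which reflects the varying point density the statement refers to. Replacing the per-axis average with `max(axis)` would give the maximal-pooling variant.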
“…To achieve efficiency, most current advanced approaches propose to utilize the grid-based representation and execute 3D object detection in the BEV space. Grid-based methods necessitate a voxelization process to transform unstructured point clouds into 3D voxels or 2D pillars, thus can be separated into voxel-based [1], [2], [7], [13]-[16], [20], [21], [23] and pillar-based [3], [8]-[10], [18], [22] approaches. After voxelization, the voxel-based approaches employ a 3D voxel backbone to encode voxel features, which are then flattened along the vertical axis to acquire 2D BEV features.…”
Section: Introduction (mentioning)
confidence: 99%
“…6D object pose estimation is a fundamental task of 3D semantic analysis with many real-world applications, such as robotic grasping [7,44], augmented reality [27], and autonomous driving [8,9,21,42]. Non-linearity of the rotation space of SO(3) makes it hard to handle this nontrivial task through direct pose regression from object observations [6, 11, 15, 18, 24-26, 39, 45, 47].…”
Section: Introduction (mentioning)
confidence: 99%