2022
DOI: 10.48550/arxiv.2206.09900
Preprint

Voxel-MAE: Masked Autoencoders for Pre-training Large-scale Point Clouds

Abstract: Mask-based pre-training has achieved great success in self-supervised learning for images, video, and language, without manually annotated supervision. However, it has not yet been studied for the large-scale point clouds with redundant spatial information encountered in autonomous driving. Because such point clouds contain a huge number of points, reconstructing the input point cloud directly is impractical. In this paper, we propose a masked voxel classification network for large-scale point cloud pre-training. Our key idea is to divide …
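
The truncated abstract, together with the citation statement quoted later in this report about discriminating empty from non-empty voxels, outlines the pre-training objective: mask voxels and classify the occupancy of the masked ones rather than reconstructing raw points. The following is a minimal sketch of that idea, assuming illustrative values for the voxel size, grid shape, and mask ratio, and a placeholder MLP standing in for the paper's actual encoder; none of these details come from the paper itself.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

def voxelize(points, voxel_size=0.5, grid=(16, 16, 4)):
    """Map raw points to a dense binary occupancy grid (1 = non-empty voxel)."""
    occ = np.zeros(grid, dtype=np.float32)
    idx = np.floor(points / voxel_size).astype(int)
    idx = np.clip(idx, 0, np.array(grid) - 1)       # keep indices inside the grid
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return occ

# Fake LiDAR-like points; a real pipeline would load a driving-scene frame.
points = np.random.rand(2048, 3) * np.array([8.0, 8.0, 2.0])
occ = torch.from_numpy(voxelize(points)).flatten()  # (V,) occupancy targets

mask_ratio = 0.7                                    # assumed, not from the paper
masked = torch.rand(occ.numel()) < mask_ratio       # which voxels to hide

# Encoder input: occupancy with the masked voxels zeroed out.
visible = occ.clone()
visible[masked] = 0.0

# Placeholder MLP in place of the paper's voxel network (unspecified here).
model = nn.Sequential(
    nn.Linear(occ.numel(), 256),
    nn.ReLU(),
    nn.Linear(256, occ.numel()),
)
logits = model(visible.unsqueeze(0)).squeeze(0)

# Pre-training objective: classify each *masked* voxel as empty vs. non-empty,
# instead of reconstructing the raw points of the input cloud.
loss = F.binary_cross_entropy_with_logits(logits[masked], occ[masked])
loss.backward()
print(f"masked-voxel occupancy loss: {loss.item():.4f}")

The binary occupancy target is what makes the objective tractable: per the abstract, reconstructing every raw point of a large-scale cloud is impractical, so the task is reduced to a per-voxel classification.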

Cited by 5 publications (5 citation statements)
References 32 publications
“…In the 3D domain, methods like [28], [33] utilize local reconstruction techniques for pretraining point cloud models. The pretrained transformer networks provide a robust initialization, which is beneficial for downstream tasks.…”
Section: B. Backbone Pretraining and Scaling (mentioning)
confidence: 99%
“…In terms of specific task improvements, Point-M2AE [127] (2022.5) improves the learning of irregular 3D point cloud representations by modifying the encoder and decoder into a pyramid architecture, capturing fine-grained and high-level semantic information of 3D shapes. VoxelMAE [128] (2022.6) addresses the sparse density of points and large variations within the same scene in autonomous driving point clouds. It specifically designs the discrimination between empty and non-empty points, similar to [126].…”
Section: Point Clouds (mentioning)
confidence: 99%
“…Nevertheless, our proposed system boasts the flexibility to utilize disparate tasks and learning frameworks. With the current advancements in point-cloud pretraining [135], [136] and auto-labeling technologies [24], [137], we now have more choices for practical task learning that are independent of manual labels. In our framework, model training at the local end can be easily replaced with these models and methodologies.…”
Section: B. Limitations (mentioning)
confidence: 99%