2021
DOI: 10.48550/arxiv.2104.13636
Preprint

Point Cloud Learning with Transformer

Abstract: The remarkable performance of Transformer networks in Natural Language Processing has promoted the development of these models for computer vision tasks such as image recognition and segmentation. In this paper, we introduce a novel framework, called Multi-level Multi-scale Point Transformer (MLMSPT), that works directly on irregular point clouds for representation learning. Specifically, a point pyramid transformer is investigated to model features with diverse resolutions or scales we defined, follow…

Citations: cited by 4 publications (7 citation statements)
References: 49 publications
“…Han et al. [15] proposed another point-wise approach to learn global features. Specifically, they designed a multilevel Transformer to extract global features of target point clouds with different resolutions, followed by concatenating these features and feeding them into a multi-scale Transformer to get the final global features.…”
Section: Related Work (mentioning)
confidence: 99%
“…Vector Attention. Usually, there are two kinds of self-attention operators: vector attention and scalar attention, where the latter has been applied in many previous 3D Transformer works [13], [15], while the former has been proven to be more effective than other operators in the fields of 2D image [31] and 3D point cloud processing [14].…”
Section: Global Feature Learning Block (GFL) (mentioning)
confidence: 99%
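
The distinction drawn in the statement above can be illustrated with a minimal sketch. It is not taken from the cited works: the function names are hypothetical, and the vector form assumes the common subtraction-based formulation with the learned mapping replaced by the identity for brevity. Scalar attention assigns one weight per neighbor, while vector attention assigns one weight per neighbor and per feature channel.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scalar_attention(q, keys, values):
    # One scalar weight per key: softmax of scaled dot products.
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    w = softmax(scores)
    channels = len(values[0])
    return [sum(w[j] * values[j][c] for j in range(len(values)))
            for c in range(channels)]

def vector_attention(q, keys, values):
    # One weight per key AND per channel. Assumption: the relation is
    # (q - k) and the learned mapping on it is the identity; real models
    # typically apply a small MLP here. Values must share q's dimension.
    out = []
    for c in range(len(q)):
        scores = [q[c] - k[c] for k in keys]
        w = softmax(scores)  # normalized over keys, separately per channel
        out.append(sum(w[j] * values[j][c] for j in range(len(values))))
    return out

q = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
print(scalar_attention(q, keys, values))
print(vector_attention(q, keys, values))
```

In the scalar case both output channels are mixed with the same weights; in the vector case each channel can favor a different neighbor, which is the extra flexibility the statement refers to.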
“…In the field of 3D point cloud processing, high-level tasks usually include: classification & segmentation [4], [7], [8], [28]-[30], [33], [35], [36], [38], [40], [41], [47], [82], [88], [89], object detection [31], [42], [43], [49]-[51], [66], [90], [91] and tracking [52]-[54], registration [55]-[59], [68], [69] and so on. Here, we start by introducing classification & segmentation tasks, which are very common and fundamental research topics in the field of 3D computer vision.…”
Section: D Tasks (mentioning)
confidence: 99%