2019
DOI: 10.1109/tvcg.2018.2866793

VERAM: View-Enhanced Recurrent Attention Model for 3D Shape Classification

Abstract: Multi-view deep neural networks are perhaps the most successful approach to 3D shape classification. However, fusing multi-view features by max or average pooling lacks a view-selection mechanism, limiting applications such as multi-view active object recognition by a robot. This paper presents VERAM, a recurrent attention model capable of actively selecting a sequence of views for highly accurate 3D shape classification. VERAM addresses an important issue commonly found in existing attention-b…

Cited by 66 publications (24 citation statements)
References 44 publications
“…Different from their method, our attention model is trained offline, hence no online sampling is required, making it efficient for online active recognition. The works of Jayaraman and Grauman [11], Xu et al. [32] and Chen et al. [5] are the most related to ours. Compared to MV-RNN [32], VERAM [5] explicitly integrates view confidence and view location constraints into the reward function, and deploys several strategies for view enhancement.…”
Section: Related Work (supporting)
confidence: 73%
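The statement above notes that VERAM folds view confidence and view-location constraints into its reward. A minimal sketch of that idea, with illustrative names and weights that are assumptions rather than the paper's actual formulation, might look like:

```python
# Hypothetical view-selection reward in the spirit of VERAM:
# classifier confidence raises the reward; revisiting an already-seen
# viewpoint is penalized. The weight `lam` is an illustrative assumption.

def view_reward(correct, confidence, view, visited, lam=0.5):
    """Reward for choosing viewpoint `view`.

    correct    -- whether the final prediction matched the ground truth
    confidence -- softmax probability of the predicted class
    view       -- index of the chosen viewpoint
    visited    -- set of viewpoint indices already observed
    lam        -- weight of the view-location (revisit) penalty
    """
    base = confidence if correct else 0.0      # confidence-weighted reward
    penalty = lam if view in visited else 0.0  # discourage revisiting views
    return base - penalty
```

In a REINFORCE-style setup, such a scalar reward would be backpropagated through the view-selection policy at the end of each observation sequence.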
“…We also notice that VERAM [5] achieves a recognition accuracy of 92.1% on ModelNet40 with 9 views of gray images. They align all shapes and render 144 gray images for each shape with the Phong reflection model.…”
Section: Results and Evaluation (mentioning)
confidence: 71%
“…With the emergence of large labeled 3D shape repositories, such as ModelNet [WSK∗15] and ShapeNet [CFG∗15], supervised deep learning has become the mainstream technology for 3D shape feature learning. Given class labels, informative features can be produced by learning the mapping between the class labels and different raw 3D representations, such as voxels [WSK∗15], meshes [FFY∗19], point clouds [QSMG16, QYSG17, LFXP19, WSL∗19] and views [SMKLM15, CZZ∗18]. Compared with unsupervised methods, these methods achieve high classification accuracy.…”
Section: Related Work (mentioning)
confidence: 99%
“…This holds for both aligned and unaligned 3D shapes. However, RNN based methods may be sensitive to the predefined viewpoints and thus cannot ensure the rotational invariance of the learned features, as suggested in [11].…”
Section: N-gram Learning Unit (mentioning)
confidence: 99%