2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00371

Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition

Abstract: Action recognition with skeleton data has recently attracted much attention in computer vision. Previous studies are mostly based on fixed skeleton graphs, only capturing local physical dependencies among joints, which may miss implicit joint correlations. To capture richer dependencies, we introduce an encoder-decoder structure, called A-link inference module, to capture action-specific latent dependencies, i.e. actional links, directly from actions. We also extend the existing skeleton graphs to represent hi…
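
To make the idea in the abstract concrete, the following rough sketch shows a graph convolution that aggregates joint features over both the fixed skeletal adjacency and an additional, data-dependent adjacency standing in for the learned "actional links". This is not the authors' A-link inference module: the class name ActionalStructuralConv, the attention-style affinity scoring, the normalisation, and all tensor shapes are assumptions made only for illustration.

# Minimal sketch (not the authors' code): combine a fixed skeleton adjacency
# with a learned, per-sample adjacency when aggregating joint features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActionalStructuralConv(nn.Module):
    def __init__(self, in_channels, out_channels, skeleton_adj):
        super().__init__()
        # Fixed structural adjacency (V x V), e.g. the bones of the skeleton graph,
        # row-normalised so each joint averages over its physical neighbours.
        self.register_buffer(
            "A_struct",
            skeleton_adj / skeleton_adj.sum(dim=1, keepdim=True).clamp(min=1.0),
        )
        # Tiny encoder that scores pairwise joint affinities -> stand-in for actional links.
        self.query = nn.Linear(in_channels, 16)
        self.key = nn.Linear(in_channels, 16)
        self.proj = nn.Linear(in_channels, out_channels)

    def forward(self, x):
        # x: (batch, V, in_channels) joint features for a single frame.
        q, k = self.query(x), self.key(x)
        A_act = F.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)  # (batch, V, V)
        out = self.A_struct @ x + A_act @ x   # aggregate over both graphs
        return F.relu(self.proj(out))

if __name__ == "__main__":
    V = 25                                    # e.g. NTU RGB+D skeletons have 25 joints
    A = torch.zeros(V, V)
    A[0, 1] = A[1, 0] = 1.0                   # toy bone; a real skeleton lists every bone
    layer = ActionalStructuralConv(3, 64, A)
    poses = torch.randn(8, V, 3)              # batch of 8 poses, 3-D coordinates per joint
    print(layer(poses).shape)                 # torch.Size([8, 25, 64])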

Cited by 1,018 publications (673 citation statements)
References 20 publications
“…The results of the comparison are shown in Tables 5 and 6. The methods used for comparison include handcrafted-feature-based methods [33], RNN-based methods [28,29,34,35], CNN-based methods [36,37], and GCN-based methods [6-10]. From Table 5, we can see that our proposed method achieves the best performances of 96.8% and 91.7% under the two evaluation criteria on the NTU-RGBD dataset.…”
Section: Comparison With the State-of-the-art (mentioning)
confidence: 99%
“…
Method                  Cross-subject (%)   Cross-view (%)
ST-GCN (2018) [6]       81.5                88.3
AS-GCN (2018) [9]       86.8                94.2
PB-GCN (2018) [8]       87.5                93.2
2s-AGCN (2019) [7]      88.5                95.1
AGC-LSTM (2019) [10]    89.2                95.0
ours                    91.7                96.8

Table 6. The results of different methods, which are designed for 3D human activity analysis, using the cross-subject and cross-setup evaluation criteria on the NTU RGB+D 120 dataset.…”
Section: Comparison With the State-of-the-art (mentioning)
confidence: 99%
“…Based on this judgment, Yan et al. [23] proposed a spatial-temporal graph convolutional network (ST-GCN) representing human joints as vertices and bones as edges. ST-GCN raised the accuracy of action recognition to a new level, and numerous ST-GCN variants have subsequently been proposed based on it [24,25,26,27,28,29,30,31,32,33,34,35]. However, there are still two problems to be addressed in these methods.…”
Section: Introduction (mentioning)
confidence: 99%
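
As a concrete illustration of the graph construction the excerpt above describes (joints as vertices, bones as undirected edges), the following short sketch builds and normalises a skeleton adjacency matrix. The five-joint toy skeleton and the helper name skeleton_adjacency are made up for illustration; they do not come from the cited papers.

# Illustrative sketch: skeleton graph with joints as vertices and bones as edges.
import numpy as np

def skeleton_adjacency(num_joints, bones, add_self_loops=True):
    """Build a (num_joints x num_joints) adjacency matrix from (i, j) bone pairs."""
    A = np.zeros((num_joints, num_joints), dtype=np.float32)
    for i, j in bones:
        A[i, j] = A[j, i] = 1.0                     # bones are undirected edges
    if add_self_loops:
        A += np.eye(num_joints, dtype=np.float32)   # keep each joint's own feature
    # Symmetric normalisation D^(-1/2) A D^(-1/2), as in standard graph convolutions.
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(A.sum(axis=1), 1e-6)))
    return d_inv_sqrt @ A @ d_inv_sqrt

if __name__ == "__main__":
    toy_bones = [(0, 1), (1, 2), (1, 3), (3, 4)]    # toy 5-joint chain with one branch
    print(skeleton_adjacency(5, toy_bones).round(2))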
“…(1) Because ST-GCN [4] may not adequately capture the dependency between far-apart joints [5], it is unable to effectively extract the global co-occurrence features of actions. (2) Since the convolution cannot consider the relationship between each vertex and its surrounding vertices, these related works [4,6,7] may not effectively obtain the spatial features composed of adjacent vertices. (3) These works [4,6,7] expand the number of channels per vertex as the number of network layers increases.…”
Section: Introduction (mentioning)
confidence: 99%
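
The first problem quoted above (poor modelling of dependencies between far-apart joints) follows from the locality of graph convolution: one adjacency multiplication only mixes features between directly connected joints, so information from a distant joint needs as many propagation steps as its hop distance. The toy chain skeleton below simply counts those hops; it is an illustration of the argument, not code from the cited works.

# Illustration: one neighbourhood aggregation step reaches only 1-hop neighbours,
# so joint 0 "sees" a far-apart joint only after hop-distance many layers.
import numpy as np

V = 6
A = np.zeros((V, V))
for i in range(V - 1):                      # toy chain skeleton: 0-1-2-3-4-5
    A[i, i + 1] = A[i + 1, i] = 1.0

reach = np.eye(V)                           # tracks which joints have influenced which
for step in range(1, V):
    reach = reach @ (A + np.eye(V))         # one "layer" of aggregation (with self-loops)
    reached = np.flatnonzero(reach[0]).tolist()
    print(f"after {step} layer(s), joint 0 has received information from joints {reached}")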