2020 25th International Conference on Pattern Recognition (ICPR) 2021
DOI: 10.1109/icpr48806.2021.9413189

Vertex Feature Encoding and Hierarchical Temporal Modeling in a Spatio-Temporal Graph Convolutional Network for Action Recognition

Abstract: This paper extends the Spatial-Temporal Graph Convolutional Network (ST-GCN) for skeleton-based action recognition by introducing two novel modules, namely, the Graph Vertex Feature Encoder (GVFE) and the Dilated Hierarchical Temporal Convolutional Network (DH-TCN). On the one hand, the GVFE module learns appropriate vertex features for action recognition by encoding raw skeleton data into a new feature space. On the other hand, the DH-TCN module is capable of capturing both short-term and long-term temporal d…
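
To illustrate how stacked dilated temporal convolutions can cover both short and long time spans, the following is a minimal sketch in plain Python. It is not the paper's DH-TCN: the function names, the kernel size of 3, the dilation rates 1, 2, 4, and the causal zero-padding are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Causal 1-D convolution over time; frames before t=0 count as zero."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation frames
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def stacked_dilated(signal: Sequence[float], kernel: Sequence[float],
                    dilations: Sequence[int]) -> List[float]:
    """Apply one dilated layer per dilation rate, feeding each into the next."""
    out = list(signal)
    for d in dilations:
        out = dilated_conv1d(out, kernel, d)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input frames that can influence a single output frame."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical setup: a single impulse at frame 0 of a 20-frame sequence.
impulse = [1.0] + [0.0] * 19
response = stacked_dilated(impulse, [1.0, 1.0, 1.0], [1, 2, 4])
span = sum(1 for v in response if v != 0.0)
```

With these assumed settings the stack reaches 15 frames back in time, while its first layer (dilation 1) still only looks at 3 adjacent frames. This is the sense in which a hierarchy of dilated layers can mix short-term and long-term context.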

Cited by 12 publications (8 citation statements)
References 39 publications
“…For example, the FGCN model outperforms Two-Stream Attention LSTM [61] by over 24% on both the cross-subject and cross-setup benchmarks. Our FGCN model outperforms the most recent methods, such as GVFE + AS-GCN [63] and ST-TR [66] on both the cross-subject and cross-setup benchmarks of the NTU-RGB+D120 dataset.…”
Section: Models (mentioning)
confidence: 85%
“…When FGCN is fed with more observations of actions in the subsequent stages, it gets higher accuracies.
[64] 67.9 62.8
Shift-GCN (2-stream) (CVPR 2020) [65] 85.3 86.6
Shift-GCN (4-stream) (CVPR 2020) [65] 85.9 87.6
ST-TR (CVIU 2021) [66] 85.1 87.1
GVFE + AS-GCN (ICPR 2021) [63] 79.2 81.2
FGCN (ours) 85.4 87.4…”
Section: Models (mentioning)
confidence: 99%
“…Instead of only use raw skeleton features (joint coordinates and/or bone lengths) like all above GCN‐based methods to construct the spatial‐temporal graphs, Papadopoulos et al. [58] introduced a more compact and efficient graph‐based framework to solve their certain limitations. It includes two modules, the Graph Vertex Feature Encoder (GVFE) module (Figure 10) learned appropriate vertex features by encoding raw skeleton data into a new feature space and the Dilated Hierarchical Temporal Convolutional Network (DH‐TCN) module was capable of capturing both short‐term and long‐term temporal dependencies using a hierarchical dilated convolutional network.…”
Section: Deep Learning‐based Action Recognition With 3d Skeleton (mentioning)
confidence: 99%
“…Cai et al [1] proposes to add flow patches to handle subtle movements into a GCN. Approaches based on GCN [6,23,37,47] have been constantly improving the state-of-the-art on skeleton-based action recognition recently.…”
Section: Skeleton-based Action Recognition (mentioning)
confidence: 99%