2022
DOI: 10.1609/aaai.v36i1.19998

Towards To-a-T Spatio-Temporal Focus for Skeleton-Based Action Recognition

Abstract: Graph Convolutional Networks (GCNs) have been widely used to model the high-order dynamic dependencies for skeleton-based action recognition. Most existing approaches do not explicitly embed the high-order spatio-temporal importance to joints’ spatial connection topology and intensity, and they do not have direct objectives on their attention module to jointly learn when and where to focus on in the action sequence. To address these problems, we propose the To-a-T Spatio-Temporal Focus (STF), a skeleton-based …
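For readers unfamiliar with this family of models, the sketch below illustrates the kind of spatio-temporal graph-convolution block that the abstract and the citing statements refer to: a fixed skeleton adjacency augmented with a learnable attention mask, followed by a temporal convolution. It is a minimal PyTorch illustration under assumed layer sizes and a placeholder adjacency matrix, not the authors' STF implementation.

```python
# Minimal sketch of a spatio-temporal graph-convolution block with a learnable
# joint-attention mask, in the spirit of the GCN-based methods discussed here.
# This is NOT the official STF implementation; layer sizes, the adjacency
# matrix, and the attention parameterization are illustrative assumptions.
import torch
import torch.nn as nn

class STGraphConvBlock(nn.Module):
    def __init__(self, in_channels, out_channels, adjacency, temporal_kernel=9):
        super().__init__()
        # Fixed skeleton topology (V x V) registered as a buffer ...
        self.register_buffer("A", adjacency)
        # ... plus a learnable additive mask so the model can re-weight or add
        # joint-to-joint connections beyond the predefined skeleton graph.
        self.attn = nn.Parameter(torch.zeros_like(adjacency))
        self.spatial = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        pad = (temporal_kernel - 1) // 2
        self.temporal = nn.Conv2d(out_channels, out_channels,
                                  kernel_size=(temporal_kernel, 1),
                                  padding=(pad, 0))
        self.relu = nn.ReLU()

    def forward(self, x):
        # x: (batch, channels, frames, joints)
        A = self.A + self.attn                   # adjusted joint connectivity
        x = torch.einsum("nctv,vw->nctw", x, A)  # aggregate over neighbor joints
        x = self.relu(self.spatial(x))           # per-joint feature transform
        x = self.relu(self.temporal(x))          # mix information across frames
        return x

# Usage on dummy NTU-style input: 25 joints, 64 frames, 3-D joint coordinates.
if __name__ == "__main__":
    A = torch.eye(25)                            # placeholder adjacency
    block = STGraphConvBlock(3, 64, A)
    out = block(torch.randn(2, 3, 64, 25))
    print(out.shape)                             # torch.Size([2, 64, 64, 25])
```

The learnable mask on top of the fixed adjacency is the general mechanism that lets such models re-weight the predefined joint topology, which is the limitation of the fixed graph that several of the citing statements below point out for ST-GCN.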

Cited by 33 publications (7 citation statements) · References 37 publications (85 reference statements)
“…Moreover, compared to GCN-based methods, the performance of our FG-STFormer is also at the top. It compares favourably with current state-of-the-art STF [17] and CTR-GCN [5] on NTU-60 and NTU-120, and even outperforms the latter on NW-UCLA by 0.5%, verifying the effectiveness of FG-STFormer.…”
Section: Comparison With the State-of-the-arts
confidence: 62%
“…ST-GCN obtains impressive recognition improvement, however, the predefined spatial graph makes it difficult to capture abstract relationships between distant joints. To overcome this limitation, many methods [14,22,23] are proposed with various kinds of learnable graph modules or attention mechanisms. Shi et al [13] propose the Multi-stream Attention-enhanced Adaptive Graph Convolutional Network (MS-AAGCN).…”
Section: Skeleton-based Action Recognition
confidence: 99%
“…GCN [6] is the first GCN-based recognition model. After that, more kinds of improvements [21][22][23] are proposed with various adaptive graph modules or attention mechanisms.…”
confidence: 99%
“…InfoGCN [8] develops an information rate-based framework for learning objectives. STF [10] provides a flexible framework for learning spatio-temporal gradients for skeleton-based action recognition. However, many advanced methods spend a great deal of effort on spatial feature extraction, ignoring the extraction of temporal features, and simply use the same multi-scale feature extraction module in each layer of the network.…”
Section: Related Work
confidence: 99%
“…GCN modules close to the network output tend to have larger receptive fields and can capture more contextual information. In addition, the receptive field of the GCN module close to the network input is relatively small [10]. It can be concluded that different layers have different effects on skeleton recognition.…”
Section: Introduction
confidence: 99%
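The receptive-field point in the last statement can be made concrete with a quick back-of-the-envelope calculation. The snippet below assumes a stack of stride-1 temporal convolutions with kernel size 9 (a common choice in ST-GCN-style models); the depth and kernel size are illustrative assumptions, not values taken from the STF paper.

```python
# Back-of-the-envelope receptive-field calculation for a stack of temporal
# convolutions (kernel size 9, stride 1), illustrating why blocks near the
# network output see far more frames than blocks near the input.
def temporal_receptive_field(num_layers, kernel_size=9, stride=1):
    rf, jump = 1, 1
    for _ in range(num_layers):
        rf += (kernel_size - 1) * jump  # each layer widens the temporal window
        jump *= stride                  # stride 1 keeps the frame spacing fixed
    return rf

for depth in (1, 4, 10):
    print(f"after {depth:2d} blocks: sees {temporal_receptive_field(depth)} frames")
# after  1 blocks: sees 9 frames
# after  4 blocks: sees 33 frames
# after 10 blocks: sees 81 frames
```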