2022
DOI: 10.48550/arxiv.2205.15936
Preprint

Skeleton-based Action Recognition via Temporal-Channel Aggregation

Abstract: Skeleton-based action recognition methods are limited by the semantic extraction of spatio-temporal skeletal maps. However, current methods have difficulty in effectively combining features from both temporal and spatial graph dimensions and tend to be thick on one side and thin on the other. In this paper, we propose a Temporal-Channel Aggregation Graph Convolutional Networks (TCA-GCN) to learn spatial and temporal topologies dynamically and efficiently aggregate topological features in different temporal and…

Cited by: 6 publications (10 citation statements)
References: 38 publications
“…Our proposed module is plug-and-play and compatible with most GCN-based backbones. To examine its universality, we apply it to 5 widely used GCN-based backbones [5,17,26,35,38] and evaluate them on the X-Sub and X-Set of NTU RGB+D 120 dataset. For fair comparisons, we reimplement them and use the same data preprocessing without the multi-stream fusion.…”
Section: Combined With Other Backbones | Citation type: mentioning
confidence: 99%
“…It is noted that most of the state-of-the-art methods employ a multi-stream fusion framework. For a fair comparison, we follow the same framework as [5,35]. We make a fusion with the results from four modalities including joint, bone, joint motion, and bone motion as the final report result.…”
Section: Comparison With the State-of-the-art | Citation type: mentioning
confidence: 99%
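The multi-stream fusion mentioned in the statement above can be illustrated with a minimal sketch. It assumes score-level fusion with equal weights, i.e. the per-class scores from the joint, bone, joint-motion, and bone-motion streams are summed and the highest-scoring class is taken; the quoted text does not state the weighting, so this is an assumption. The function name `fuse_scores` and the sample scores are hypothetical and do not come from any cited paper.

```python
from typing import Dict, List


def fuse_scores(stream_scores: Dict[str, List[List[float]]]) -> List[int]:
    """Sum per-class scores across streams; return the argmax class per sample.

    Assumption: equal-weight score-level fusion. Real pipelines may weight
    the streams differently.
    """
    streams = list(stream_scores.values())
    num_samples = len(streams[0])
    predictions = []
    for i in range(num_samples):
        # Element-wise sum of the class-score vectors from every stream.
        fused = [sum(vals) for vals in zip(*(s[i] for s in streams))]
        # Index of the largest fused score is the predicted class.
        predictions.append(max(range(len(fused)), key=fused.__getitem__))
    return predictions


# Hypothetical scores for 2 samples and 3 classes, one entry per modality.
scores = {
    "joint": [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    "bone": [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
    "joint_motion": [[0.4, 0.5, 0.1], [0.2, 0.2, 0.6]],
    "bone_motion": [[0.7, 0.2, 0.1], [0.3, 0.3, 0.4]],
}
predictions = fuse_scores(scores)
print(predictions)
```

In the second sample the joint stream alone would pick class 1, while the fused scores pick class 2, which is the kind of disagreement that fusion is meant to resolve.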
“…Literature [7] and literature [10] have computational costs similar to that of this paper, but the accuracy is 5.3% and 0.6% lower than that of this paper, respectively. Literature [11][12][13][14][15][16] has a higher performance in recognition accuracy, but also has a larger amount of computation and parameters. Therefore, the method proposed in this paper balances the three performance indexes of calculation cost, parameter number and recognition accuracy well, and ensures a high recognition accuracy while reducing parameter number and calculation cost.…”
Section: Comparison With Other Action Recognition Algorithms | Citation type: mentioning
confidence: 99%
“…Zhou et al. [26] introduced a graph attention block based on Convolutional Block Attention Module (CBAM), which is used to calculate the semantic correlation between any two joints. TCA-GCN [27] used an MS-CAM attention fusion mechanism to solve the problem of the contextual aggregation of skeletal features. However, at present, the local and global context aggregation of skeletal features and the integration of initial features are still a major difficulty, and how to effectively integrate initial features into high-dimensional features needs further research.…”
Section: Related Work | Citation type: mentioning
confidence: 99%