2020 · Preprint
DOI: 10.48550/arxiv.2010.12221
Temporal Attention-Augmented Graph Convolutional Network for Efficient Skeleton-Based Human Action Recognition

Abstract: Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures, like sequences of body skeletons forming actions modeled as spatiotemporal graphs. Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action. This leads to a high number of floating point operations (ranging from 16G to 100G FLOPs) to process a single sample, making their adoption in restricted computation application sc…

Cited by 2 publications (4 citation statements) · References 27 publications
“…Some approaches use temporal attention to select the most informative frames or joints along the temporal dimension. N. Heidari et al. [130] proposed a temporal attention module (TAM) to increase efficiency in skeleton-based action recognition. It selects the most informative skeletons of an action, i.e., the skeletons corresponding to the highest attention values at the shallow layers of the network.…”
Section: The Common Framework
confidence: 99%
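The frame-selection mechanism described in this citation statement can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the weight vector `w` is a random stand-in for the TAM's learned attention parameters, and `temporal_attention_select` is a hypothetical name. The point it shows is the core idea — score each frame, softmax over time, and keep only the top-k frames so deeper layers process a shorter sequence.

```python
import numpy as np

def temporal_attention_select(frames, k, rng=None):
    """Keep the k most informative frames of a skeleton sequence.

    frames: array of shape (T, J, C) -- T frames, J joints, C channels.
    The weight vector `w` below is a random stand-in for the learned
    parameters of a temporal attention module.
    """
    T, J, C = frames.shape
    rng = np.random.default_rng(0) if rng is None else rng
    w = rng.standard_normal(J * C)          # stand-in for learned weights
    scores = frames.reshape(T, -1) @ w      # one scalar score per frame
    attn = np.exp(scores - scores.max())
    attn = attn / attn.sum()                # softmax over the T frames
    keep = np.sort(np.argsort(attn)[-k:])   # top-k frames, original order
    return frames[keep], keep
```

Selecting, say, k = 4 of T = 16 frames at a shallow layer means all subsequent graph convolutions run on a quarter of the sequence, which is where the FLOP savings reported for this line of work come from.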
“…Green text annotates the methods measured by ‘Params (M)’, and blue text annotates ‘FLOPs (G)/Action’. The numbers around the dots denote [13, 14, 50, 54, 55, 57, 59, 62, 73, 77, 80, 82, 85, 86, 89, 110, 116, 117, 121, 127, 130, 135, 141, 183], respectively, in ascending order. Number 94 [54] stands out, with both relatively low complexity and a small model size.…”
Section: Figure
confidence: 99%
“…Recently, GNNs and GCNs have been demonstrated to be a more principled and effective choice for parsing the graph structure of skeleton data [10], enabling inner-correlation capture without segmenting the whole body. As another example, TA-GCN [11] uses an attention module in a GCN-based spatial-temporal model to extract more predictive features from graph data. However, these works focus only on action recognition, so they cannot exploit available motion data to enhance recognition ability.…”
Section: B. Skeleton-based Human Action Recognition
confidence: 99%