Symbiotic Graph Neural Networks for 3D Skeleton-Based Human Action Recognition and Motion Prediction

Li, Maosen; Chen, Siheng; Chen, Xu; Zhang, Ya; Wang, Yanfeng; Tian, Qi

doi:10.1109/tpami.2021.3053765

Cited by 150 publications

(81 citation statements)

References 57 publications

“…Especially in Kinetics dataset, the top-5 accuracies show an obvious improvement. This shows that low-level features can widen the gap between similar classes.…”

TABLE 1. Comparisons of validation accuracy with the state-of-the-art methods on the NTU-RGB+D dataset with just joints

Methods                  X-Sub(%)  X-View(%)
Lie Group [31]           50.1      52.8
HBRNN [32]               59.1      64.0
Deep LSTM [25]           60.7      67.3
ST-LSTM [33]             69.2      77.7
STA-LSTM [34]            73.4      81.2
VA-RNN [18]              79.8      88.9
VA-CNN [18]              88.7      94.3
Synthesized CNN [41]     80.0      87.2
CNN+Motion+Trans [42]    83.2      89.3
3scale ResNet [43]       85.0      92.3
ST-GCN [19]              81.5      88.3
ASGCN [20]               86.8      94.2
2s-AGCN [21]             88.0      95.1
MS-G3D [43]              89.4      95.0
Sym-GNN [44]             87.1      94.8
DSTA-Net [45]            91.5      96.4
Shift-GCN [47]           87.

Section: Effectiveness Of the Low-level Features (mentioning)

confidence: 99%

Skeleton-Based Action Recognition With Low-Level Features of Adaptive Graph Convolutional Networks

et al. 2021

Skeleton-based action recognition is a typical classification problem which plays a significant role in human-computer interaction and video understanding. Since a human skeleton has natural graphic features, methods based on graph convolutional networks (GCN) are widely applied in skeleton-based action recognition. Previous studies mainly focus on structural links in GCN to generate high-level features of the human skeleton. However, low-level features are also important in many applications. For instance, low-level edge gradient and color information are important for image classification. This paper introduces a multi-branch structure to capture different low-level features of the human skeleton. We combine both high-level and low-level features to recognize human action. We validate our method in action recognition with two skeleton datasets, NTU-RGB+D and Kinetics. Experimental results indicate that the proposed method achieves considerable improvement over some state-of-the-art methods.

“…Both actional links and structural links are essentially fixed during classification. Based on [20], Li et al. propose Sym-GNN [44] to capture body-part links. The main idea of Sym-GNN and AS-GCN is to determine the adjacency matrix with segmentation.…”

Section: Introduction (mentioning)

confidence: 99%

“…Yan et al. [31] designed a Hierarchical Graph-based Cross Inference Network (HiG-CIN), in which three levels of information include the body-region level, person level, and group-activity level. Li et al. [32] proposed symbiotic graph neural networks, which contain a backbone, an action-recognition head, and a motion-prediction head.…”

Section: Interaction Modelling With Shallow Models (mentioning)

confidence: 99%

Multiperson Interactive Activity Recognition Based on Interaction Relation Model

Chi et al. 2021

Journal of Mathematics

Multiperson activity recognition is a pivotal branch as well as a challenging topic of human action recognition research. This paper applies a hybrid learning model to the spatio-temporal and occlusion relationships among multiple people. First, it builds an active multiperson interaction relationship estimation framework to capture interpersonal spatio-temporal relations. This model incorporates the interaction relationship estimation framework with the multiperson relationship network. On this basis, it learns automatically from the human-computer interaction dataset in an end-to-end manner and performs reasoning with standard matrix operations. Second, the paper proposes an adaptive occlusion state behavior recognition method derived from the semantic knowledge model to address occlusion and self-occlusion in human action recognition. Then, Petri Nets are used to recognize multiperson interactive actions. The model was evaluated in extensive experiments on the TV interaction dataset, Vlog dataset, AVA dataset, and MLB-YouTube dataset; the results show that its recognition performance is superior to that of the other available models. The paper also summarizes, and discusses prospects for, the estimation framework of the interaction relationship and the occlusion semantic-knowledge relationship. Experimental results suggest that the proposed method could capture the discriminative relation information for multiperson interactive activity recognition, which further validates the efficiency of the hybrid learning model.

“…The current mainstream researches based on ST-GCN improve the recognition accuracy of skeleton recognition task by multistream input [37], adding optimization module, improving loss function [38], improving convolution kernel [24,39], and increasing attention [34]. These methods make the network deeper and the structure of each layer more complex; they often introduce many parameters and extremely difficult training processes and frequently require many computing resources and long training times.…”

Section: Introduction (mentioning)

confidence: 99%

A Lightweight Hierarchical Model with Frame-Level Joints Adaptive Graph Convolution for Skeleton-Based Action Recognition

Jiang, Yang, Liu et al. 2021

Security and Communication Networks

In skeleton-based human action recognition methods, human behaviours can be analysed through temporal and spatial changes in the human skeleton. Skeletons are not limited by clothing changes, lighting conditions, or complex backgrounds. This recognition method is robust and has aroused great interest; however, many existing studies used deep-layer networks with large numbers of required parameters to improve the model performance and thus lost the low-computation advantage of skeleton data. It is difficult to deploy previously established models to real-life applications based on low-cost embedded devices. To obtain a model with fewer parameters and a higher accuracy, this study designed a lightweight frame-level joints adaptive graph convolutional network (FLAGCN) model to solve skeleton-based action recognition tasks. Compared with the classical 2s-AGCN model, the new model obtained a higher precision with 1/8 of the parameters and 1/9 of the floating-point operations (FLOPs). The proposed network features three main improvements. First, a previous feature-fusion method replaces the multistream network and reduces the number of required parameters. Second, at the spatial level, two kinds of graph convolution methods capture different aspects of human action information. A frame-level graph convolution constructs a human topological structure for each data frame, whereas an adjacency graph convolution captures the characteristics of the adjacent joints. Third, the model proposed in this study hierarchically extracts different levels of action sequence features, making the model clear and easy to understand; further, it reduces the depth of the model and the number of parameters. A large number of experiments on the NTU RGB + D 60 and 120 data sets show that this method has the advantages of few required parameters, low computational costs, and fast speeds. It also has a simple structure and training process that make it easy to deploy in real-time recognition systems based on low-cost embedded devices.
