A New Representation of Skeleton Sequences for 3D Action Recognition

Ke, Qiuhong; Bennamoun, Mohammed; An, Senjian; Sohel, Ferdous; Boussaid, Farid

doi:10.1109/cvpr.2017.486

Cited by 824 publications

(653 citation statements)

References 47 publications

Supporting

Mentioning

645

Contrasting

Unclassified

“…Benefiting from the merits of recurrent neural network for sequential data, some works adopt recurrent neural network to explore the spatial and temporal dynamics of skeletal data [10], [11], [13], [39]-[42]. With the help of convolutional neural networks, Ke et al. [43] proposed a new representation for 3D skeleton data, which transfers action recognition problem to the problem of image classification.…”

Section: B Skeleton-based Action Recognition (mentioning)

confidence: 99%

Modality Compensation Network: Cross-Modal Adaptation for Action Recognition

Song

Liu

et al. 2020

IEEE Trans. on Image Process.

With the prevalence of RGB-D cameras, multimodal video data have become more available for human action recognition. One main challenge for this task lies in how to effectively leverage their complementary information. In this work, we propose a Modality Compensation Network (MCN) to explore the relationships of different modalities, and boost the representations for human action recognition. We regard RGB/optical flow videos as source modalities and skeletons as the auxiliary modality. Our goal is to extract more discriminative features from source modalities, with the help of the auxiliary modality. Built on deep Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) networks, our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning, so that the network learns to compensate for the loss of skeletons at test time and even at training time. We explore multiple adaptation schemes to narrow the distance between source and auxiliary modal distributions from different levels, according to the alignment of source and auxiliary data in training. In addition, skeletons are only required in the training phase. Our model is able to improve the recognition performance with source data when testing. Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.

“…
Method | Accuracy
Raw Skeleton [66] | 49.7%
Joint Feature [67] | 86.9%
CHARM [53] | 83.9%
Hierarchical RNN [68] | 80.3%
Deep LSTM [40] | 86.3%
Deep LSTM + Co-occurrence [40] | 87.4%
Clips + CNN + Concatenation [43] | 92.8%
Clips + CNN + Pooling [43] | 92.2%
Clips + CNN + MTLN [43] | 93.5%
ST-LSTM (w/o Attention) [37] | 88.6%
ST-LSTM (w/o Attention) + Trust Gate [37] | 93.3%
ST-LSTM with Attention | 94.2%

Table 1: Comparison of state-of-the-art action recognition models trained on SBU dataset. The results highlight the importance of the spatio-temporal attention mechanism which improves the accuracy of the ST-LSTM.…”

Section: Action Recognition (mentioning)

confidence: 99%

Human Action Performance Using Deep Neuro-Fuzzy Recurrent Attention Model

et al. 2020

A great number of computer vision publications have focused on distinguishing between human action recognition and classification rather than the intensity of actions performed. Indexing the intensity which determines the performance of human actions is a challenging task due to the uncertainty and information deficiency that exists in the video inputs. To remedy this uncertainty, in this paper we coupled fuzzy logic rules with the neural-based action recognition model to rate the intensity of a human action as intense or mild. In our approach, we used a Spatio-Temporal LSTM to generate the weights of the fuzzy-logic model, and then demonstrate through experiments that indexing of the action intensity is possible. We analyzed the integrated model by applying it to videos of human actions with different action intensities and were able to achieve an accuracy of 89.16% on our intensity indexing generated dataset. The integrated model demonstrates the ability of a neuro-fuzzy inference module to effectively estimate the intensity index of human actions.

“…In this way, they could analyze the hidden sources of information in actions. Ke et al. [19] transformed skeleton sequences into clips consisting of spatial-temporal features. They used deep convolutional neural networks to learn long-term temporal information.…”

Section: Introduction (mentioning)

confidence: 99%

Recognizing Involuntary Actions from 3D Skeleton Data Using Body States

2018

Abstract: Human action recognition has been one of the most active fields of research in computer vision over the last years. Two dimensional action recognition methods are facing serious challenges such as occlusion and missing the third dimension of data. Development of depth sensors has made it feasible to track positions of human body joints over time. This paper proposes a novel method for action recognition which uses temporal 3D skeletal Kinect data. This method introduces the definition of body states and then every action is modeled as a sequence of these states. The learning stage uses Fisher Linear Discriminant Analysis (LDA) to construct discriminant feature space for discriminating the body states. Moreover, this paper suggests the use of the Mahalanobis distance as an appropriate distance metric for the classification of the states of involuntary actions. Hidden Markov Model (HMM) is then used to model the temporal transition between the body states in each action. According to the results, this method significantly outperforms other popular methods, with recognition (recall) rate of 88.64% for eight different actions and up to 96.18% for classifying the class of all fall actions versus normal actions.

Index Terms: Human action recognition, involuntary action recognition, Fisher, linear discriminant analysis (LDA), Kinect, 3D skeleton data, hidden Markov model (HMM).
