2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
DOI: 10.1109/cvpr.2017.486

A New Representation of Skeleton Sequences for 3D Action Recognition

Abstract: This paper presents a new method for 3D action recognition with skeleton sequences (i.e., 3D trajectories of human skeleton joints). The proposed method first transforms each skeleton sequence into three clips each consisting of several frames for spatial temporal feature learning using deep neural networks. Each clip is generated from one channel of the cylindrical coordinates of the skeleton sequence. Each frame of the generated clips represents the temporal information of the entire skeleton sequence, and i…
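
The coordinate step mentioned in the abstract can be illustrated with a minimal sketch. It only shows how a skeleton sequence of Cartesian joint positions might be converted to cylindrical coordinates and split into one array per channel; how the paper turns each channel into a clip of frames is not described in this excerpt and is not attempted here. The function names `to_cylindrical` and `split_channels`, the choice of z as the cylinder axis, and the toy data are all assumptions made for illustration.

```python
import math


def to_cylindrical(x, y, z):
    """Convert one Cartesian point to cylindrical (rho, phi, z).

    Assumption: z is taken as the cylinder axis.
    """
    rho = math.hypot(x, y)   # radial distance from the z axis
    phi = math.atan2(y, x)   # azimuth angle in radians, in (-pi, pi]
    return rho, phi, z


def split_channels(sequence):
    """Split a skeleton sequence into three per-channel arrays.

    sequence: list of T frames, each a list of J (x, y, z) joint tuples.
    Returns (rho, phi, z), each a T x J nested list.
    """
    rho_ch, phi_ch, z_ch = [], [], []
    for frame in sequence:
        converted = [to_cylindrical(*joint) for joint in frame]
        rho_ch.append([c[0] for c in converted])
        phi_ch.append([c[1] for c in converted])
        z_ch.append([c[2] for c in converted])
    return rho_ch, phi_ch, z_ch


# Hypothetical toy sequence: 2 frames, 2 joints per frame.
toy_sequence = [
    [(1.0, 0.0, 0.5), (0.0, 2.0, 1.0)],
    [(3.0, 4.0, 0.0), (-1.0, 0.0, 2.0)],
]
rho, phi, z = split_channels(toy_sequence)
```

Each of the three returned arrays has one row per frame and one column per joint, so each can be treated as a single-channel 2D input in a later step.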

Cited by 824 publications (653 citation statements)
References: 47 publications
“…Benefiting from the merits of recurrent neural network for sequential data, some works adopt recurrent neural network to explore the spatial and temporal dynamics of skeletal data [10], [11], [13], [39]- [42]. With the help of convolutional neural networks, Ke et al [43] proposed a new representation for 3D skeleton data, which transfers action recognition problem to the problem of image classification.…”
Section: B Skeleton-based Action Recognition (mentioning)
confidence: 99%
“…
Method                                        Accuracy
Raw Skeleton [66]                             49.7%
Joint Feature [67]                            86.9%
CHARM [53]                                    83.9%
Hierarchical RNN [68]                         80.3%
Deep LSTM [40]                                86.3%
Deep LSTM + Co-occurrence [40]                87.4%
Clips + CNN + Concatenation [43]              92.8%
Clips + CNN + Pooling [43]                    92.2%
Clips + CNN + MTLN [43]                       93.5%
ST-LSTM (w/o Attention) [37]                  88.6%
ST-LSTM (w/o Attention) + Trust Gate [37]     93.3%
ST-LSTM with Attention                        94.2%

Table 1: Comparison of state-of-the-art action recognition models trained on SBU dataset. The results highlight the importance of the spatio-temporal attention mechanism which improves the accuracy of the ST-LSTM.…”
Section: Action Recognition (mentioning)
confidence: 99%
“…In this way, they could analyze the hidden sources of information in actions. Ke et al [19] transformed skeleton sequences into clips consisting spatial temporal features. They used deep convolutional neural networks to learn long-term temporal information.…”
Section: Introduction (mentioning)
confidence: 99%