2018
DOI: 10.1016/j.cviu.2018.03.003

Exploiting deep residual networks for human action recognition from skeletal data

Abstract: The computer vision community is currently focusing on solving action recognition problems in real videos, which contain thousands of samples with many challenges. In this process, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state-of-the-art in various vision-based action recognition systems. Recently, the introduction of residual connections in conjunction with a more traditional CNN model in a single architecture called Residual Network (ResNet) has shown impre…


Cited by 68 publications (45 citation statements)
References 61 publications
“…Generally speaking, four hypotheses that motivate us to build a skeleton-based representation and design DenseNets for 3D HAR include: (1) human actions can be correctly represented via movements of the skeleton [16]; (2) spatio-temporal evolutions of skeletons can be transformed into color images, a kind of 3D tensor that can be effectively learned by D-CNNs [1,5,3]; this hypothesis was proved in our previous studies [27,28,29]; (3) compared to RGB and depth modalities, skeletal data carries high-level information with much less complexity, which makes the learning model much simpler and requires less computation, allowing us to build a real-time deep learning framework for the HAR task; (4) DenseNet is currently one of the most effective CNN architectures for image recognition.…”
Section: Introduction
confidence: 81%
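
Hypothesis (2) above, encoding the spatio-temporal evolution of a skeleton sequence as a color image (a 3D tensor), can be illustrated with a minimal sketch. The sequence shape, min-max normalization, and 8-bit quantization below are assumptions for illustration, not the exact encoding used in the cited studies.

# Sketch of hypothesis (2): mapping a skeleton sequence onto a color image.
# Shapes and normalization are illustrative assumptions.
import numpy as np

def skeleton_to_image(seq: np.ndarray) -> np.ndarray:
    """Encode a skeleton sequence of shape (T, J, 3): T frames, J joints,
    (x, y, z) coordinates, as an 8-bit color image of shape (T, J, 3).

    Each coordinate axis is min-max normalized over the whole sequence and
    quantized to [0, 255], so rows index time, columns index joints, and the
    RGB channels carry the x, y, z evolution of each joint.
    """
    lo = seq.min(axis=(0, 1), keepdims=True)          # per-axis minimum
    hi = seq.max(axis=(0, 1), keepdims=True)          # per-axis maximum
    norm = (seq - lo) / (hi - lo + 1e-8)              # scale to [0, 1]
    return (norm * 255).astype(np.uint8)              # quantize to a color image

# Example: a 40-frame sequence of 20 joints becomes a 40x20 RGB image
# that a standard D-CNN can consume directly.
image = skeleton_to_image(np.random.rand(40, 20, 3))
print(image.shape)  # (40, 20, 3)
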
“…One of the major challenges in exploiting D-CNNs for skeleton-based action recognition is how a skeleton sequence could be effectively represented and fed to the deep networks. As D-CNNs work well on still images [18], our idea therefore is to encode the spatial and temporal dynamics of skeletons into 2D images [28,29]. Two essential elements for describing an action are static poses and their temporal dynamics.…”
Section: Enhanced Skeleton Pose-Motion Feature
confidence: 99%
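
A minimal sketch of the two elements named here, static poses and their temporal dynamics, assuming the motion is taken as simple frame-to-frame joint displacements; the normalization and the two separate maps are illustrative choices, not the cited method.

# Static poses + temporal dynamics as two color maps from one (T, J, 3) sequence.
import numpy as np

def pose_motion_maps(seq: np.ndarray):
    """seq: (T, J, 3) skeleton sequence -> (pose_map, motion_map) color images."""
    def to_uint8(x):
        lo, hi = x.min(), x.max()
        return ((x - lo) / (hi - lo + 1e-8) * 255).astype(np.uint8)

    pose_map = to_uint8(seq)                 # static poses, frame by frame
    motion = seq[1:] - seq[:-1]              # temporal dynamics: joint displacements
    motion_map = to_uint8(motion)
    return pose_map, motion_map

pose_img, motion_img = pose_motion_maps(np.random.rand(40, 20, 3))
print(pose_img.shape, motion_img.shape)  # (40, 20, 3) (39, 20, 3)
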
“…These intermediate representations are then fed to a Deep Convolutional Neural Network (D-CNN) for learning and classifying actions. This idea has been proven effective in [51,52,53]. Thus, the spatio-temporal patterns of a 3D pose sequence are transformed into a single color image as a global representation called Enhanced-SPMF [53] via two important elements of a human movement: 3D poses and their motions.…”
Section: 3D Pose-based Action Recognition
confidence: 99%
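
As a rough illustration of how pose and motion maps might be folded into a single, contrast-enhanced color image, the sketch below concatenates the two maps along the joint axis and applies plain per-channel histogram equalization; both the layout and the enhancement step are assumptions standing in for the Enhanced-SPMF construction described in [53].

# One global color image from pose and motion maps, plus a contrast-enhancement step.
import numpy as np

def equalize_channel(ch: np.ndarray) -> np.ndarray:
    """Histogram-equalize one uint8 channel."""
    hist = np.bincount(ch.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1) * 255
    return cdf[ch].astype(np.uint8)

def enhanced_representation(pose_img: np.ndarray, motion_img: np.ndarray) -> np.ndarray:
    """Stack pose and motion images side by side and enhance contrast per channel."""
    rows = min(pose_img.shape[0], motion_img.shape[0])
    combined = np.concatenate([pose_img[:rows], motion_img[:rows]], axis=1)
    return np.stack([equalize_channel(combined[..., c]) for c in range(3)], axis=-1)

pose_img = np.random.randint(0, 256, (40, 20, 3), dtype=np.uint8)
motion_img = np.random.randint(0, 256, (39, 20, 3), dtype=np.uint8)
print(enhanced_representation(pose_img, motion_img).shape)  # (39, 40, 3)
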
“…For video object detection and classification, several other neural networks have been proposed. The ResNet architecture uses RGB images with encoded spatial-temporal features extracted from 3D skeleton keypoints [12,21]. In 2018, the ResNet model was extended, under five distinct architectures, in [22].…”
Section: Related Work
confidence: 99%
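
For context, the residual connection that distinguishes ResNet from a plain CNN can be sketched in a few lines of PyTorch; the channel count, input size, and layer arrangement are illustrative only, not the five architectures evaluated in [22].

# Minimal residual unit (y = F(x) + x) operating on encoded skeleton images.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with batch norm and an identity shortcut."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # residual (skip) connection

# A batch of 3-channel encoded skeleton images, e.g. 40x20 pose-motion maps.
x = torch.randn(8, 3, 40, 20)
print(ResidualBlock(channels=3)(x).shape)  # torch.Size([8, 3, 40, 20])
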