A Comparative Review of Recent Kinect-Based Action Recognition Algorithms

Wang, Lei; Huynh, Du Q.; Koniusz, Piotr

doi:10.1109/tip.2019.2925285

Cited by 218 publications

(110 citation statements)

References 70 publications

(290 reference statements)

Supporting

Mentioning

109

Contrasting

Unclassified

Order By: Relevance

“…There are other alternatives to handle this problem like the use of a Kinect camera [57], that can be helpful for recognition of the action as well as the summarization of it. But, the occlusion and action recognition in a crowd scene still represent a challenge [58,[62][63][64].…”

Section: Action Recognitionmentioning

confidence: 99%

A combined multiple action recognition and summarization for surveillance video sequences

et al. 2020

View full text Add to dashboard Cite

Human action recognition and video summarization represent challenging tasks for several computer vision applications including video surveillance, criminal investigations, and sports applications. For long videos, it is difficult to search within a video for a specific action and/or person. Usually, human action recognition approaches presented in the literature deal with videos that contain only a single person, and they are able to recognize his action. This paper proposes an effective approach to multiple human action detection, recognition, and summarization. The multiple action detection extracts human bodies' silhouette, then generates a specific sequence for each one of them using motion detection and tracking method. Each of the extracted sequences is then divided into shots that represent homogeneous actions in the sequence using the similarity between each pair frames. Using the histogram of the oriented gradient (HOG) of the Temporal Difference Map (TDMap) of the frames of each shot, we recognize the action by performing a comparison between the generated HOG and the existed HOGs in the training phase which represents all the HOGs of many actions using a set of videos for training. Also, using the TDMap images we recognize the action using a proposed CNN model. Action summarization is performed for each detected person. The efficiency of the proposed approach is shown through the obtained results for mainly multi-action detection and recognition.

show abstract

Section: Action Recognitionmentioning

confidence: 99%

A combined multiple action recognition and summarization for surveillance video sequences

et al. 2020

View full text Add to dashboard Cite

show abstract

“…''Human action or activity recognition has played significant roles in many potential applications, including security surveillance, human-computer interaction (HCI), health monitoring and intelligent transportation [1]- [6]. For instance, in healthcare environments, by monitoring the behavior of people and recognizing human activities, the activity habits and patterns of people can be understood.…”

Section: Introductionmentioning

confidence: 99%

Multi-Modal Human Action Recognition With Sub-Action Exploiting and Class-Privacy Preserved Collaborative Representation Learning

Liang

Liu

et al. 2020

IEEE Access

View full text Add to dashboard Cite

Multimodal human action recognition with depth sensors has drawn wide attention, due to its potential applications such as health-care monitoring, smart buildings/home, intelligent transportation, and security surveillance. As one of the obstacles of robust action recognition, sub-actions sharing, especially among similar action categories, makes human action recognition more challenging. This paper proposes a segmental architecture to exploit the relations of sub-actions, jointly with heterogeneous information fusion and Class-privacy Preserved Collaborative Representation (CPPCR) for multi-modal human action recognition. Specifically, a segmental architecture is proposed based on the normalized action motion energy. It models long-range temporal structure over video sequences to better distinguish the similar actions bearing sub-action sharing phenomenon. The sub-action based depth motion and skeleton features are then extracted and fused. Moreover, by introducing within-class local consistency into Collaborative Representation (CR) coding, CPPCR is proposed to address the challenging sub-action sharing phenomenon, learning the high-level discriminative representation. Experiments on four datasets demonstrate the effectiveness of the proposed method.

show abstract

“…Human action recognition has a wide range of applications [1], such as human-computer interaction, video surveillance, health care, entertainment, etc. Its application has become one of the research hotspots in the field of computer vision [2].…”

Section: Introductionmentioning

confidence: 99%

Global Spatio-Temporal Attention for Action Recognition Based on 3D Human Skeleton Data

et al. 2020

View full text Add to dashboard Cite

The human skeleton joints captured by RGB-D camera are widely used in action recognition for its robust and comprehensive 3D information. Presently, most action recognition methods based on skeleton joints treat all skeletal joints with the same importance spatially and temporally. However, the contributions of skeletal joints vary significantly. Hence, a GL-LSTM+Diff model is proposed to improve the recognition of human actions. A global spatial attention (GSA) model is proposed to express the different weights for different skeletal joints to provide precise spatial information for human action recognition. The accumulative learning curve (ALC) model is introduced to highlight which frames contribute most to the final decision making by giving varying temporal weights to each intermediate accumulated learning results. By integrating the proposed GSA (for spatial information) and ALC (for temporal processing) models into the LSTM framework and taking the human skeletal joints as inputs, a global spatio-temporal action recognition framework (GL-LSTM) is constructed to recognize human actions. Diff is introduced as the preprocessing method to enhance the dynamic of the features, thus to get distinguishable features in deep learning. Rigorous experiments on the largest dataset NTU RGB+D and the common small dataset SBU show that the algorithm proposed in this paper outperforms other state-of-the-art methods.

show abstract

A Comparative Review of Recent Kinect-Based Action Recognition Algorithms

Cited by 218 publications

References 70 publications

A combined multiple action recognition and summarization for surveillance video sequences

A combined multiple action recognition and summarization for surveillance video sequences

Multi-Modal Human Action Recognition With Sub-Action Exploiting and Class-Privacy Preserved Collaborative Representation Learning

Global Spatio-Temporal Attention for Action Recognition Based on 3D Human Skeleton Data

Contact Info

Product

Resources

About