2022
DOI: 10.1155/2022/8323962

A Comprehensive Review of Recent Deep Learning Techniques for Human Activity Recognition

Abstract: Human action recognition is an important field in computer vision that has attracted remarkable attention from researchers. This survey aims to provide a comprehensive overview of recent human action recognition approaches based on deep learning using RGB video data. Our work divides recent deep learning-based methods into five different categories to provide a comprehensive overview for researchers who are interested in this field of computer vision. Moreover, a pure-transformer architecture (convolution-free…

Cited by 36 publications (16 citation statements)
References 82 publications (172 reference statements)
“…Because skeleton data formats provide more compact information with only 3D coordinate values, they are easily manageable and can be used to build HAR methods for lightweight devices. The upcoming subsections describe state-of-the-art HAR methods introduced using 3D skeleton data modalities, as listed in Table 1 in terms of category, year of publication, method, and dataset [21].…”
Section: State-of-the-art Methods (mentioning)
confidence: 99%
“…The LSTM is an advanced variant of RNN with the capability of preserving long-term dependencies by using internal feedback [22]. Essentially, the LSTM layers prevent older information from gradually vanishing [23].…”
Section: Machine Learning Algorithms-deep Learning (mentioning)
confidence: 99%
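
The claim in the statement above, that an LSTM keeps older information from gradually vanishing, can be illustrated with a minimal scalar sketch. This is not code from any of the cited papers: the function `lstm_step`, the `params` dictionary, and every weight value are hypothetical and hand-picked so that the forget gate stays close to 1. A plain tanh recurrence with an assumed weight of 0.5 is run alongside for comparison.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, params):
    """One step of a scalar LSTM cell.

    params maps a gate name to (w_x, w_h, b); all values are illustrative.
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h + b)

    f = gate("forget", sigmoid)   # how much of the old cell state to keep
    i = gate("input", sigmoid)    # how much of the candidate to write
    o = gate("output", sigmoid)   # how much of the cell state to expose
    g = gate("cand", math.tanh)   # candidate value
    c_new = f * c + i * g         # additive update of the cell state
    h_new = o * math.tanh(c_new)
    return h_new, c_new


# Hand-picked weights (assumptions): the forget gate is nearly 1 at all times,
# and the input gate opens only when x is 1.
params = {
    "forget": (0.0, 0.0, 6.0),
    "input": (12.0, 0.0, -6.0),
    "output": (0.0, 0.0, 6.0),
    "cand": (2.0, 0.0, 0.0),
}

inputs = [1.0] + [0.0] * 50       # one signal at t = 0, then silence
h, c = 0.0, 0.0                   # LSTM hidden and cell state
r = 0.0                           # state of a plain tanh recurrence
for x in inputs:
    h, c = lstm_step(x, h, c, params)
    r = math.tanh(0.5 * r + x)

print("LSTM cell state after 51 steps:", c)
print("plain recurrent state after 51 steps:", r)
```

With these assumed weights the cell state still carries most of the initial signal after 50 silent steps, while the plain recurrent state has decayed to practically zero. Trained networks learn their gate values rather than having them fixed like this.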
“…Moreover, the CNN model pooling layer loses a lot of valuable information, and the lack of a memory function as well as the limited data size and high computational requirements are other shortcomings of CNN [41] (Table 1). The Elman RNN has shown good ability in capturing the dynamics of sequences via recurrent connection, e.g., as in natural language processing [22]. The Elman RNN is effective and shows good generalization ability and has been widely used for solving practical problems [21].…”
Section: Machine Learning Algorithms-traditional Machine Learning Alg... (mentioning)
confidence: 99%
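
The recurrent connection mentioned in the statement above can be sketched in a few lines. The following is an illustrative Elman-style update, h_t = tanh(W_x x_t + W_h h_(t-1) + b), written with plain Python lists. The helper names `elman_step` and `run_elman` and all weight values are assumptions made for this sketch and do not come from the cited work.

```python
import math


def elman_step(x, h_prev, W_x, W_h, b):
    """Compute h_t = tanh(W_x x_t + W_h h_(t-1) + b) for list-based vectors."""
    h = []
    for j in range(len(b)):
        s = b[j]
        s += sum(W_x[j][k] * x[k] for k in range(len(x)))
        s += sum(W_h[j][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(s))
    return h


def run_elman(sequence, W_x, W_h, b):
    """Feed a sequence through the recurrence, starting from a zero state."""
    h = [0.0] * len(b)
    for x in sequence:
        h = elman_step(x, h, W_x, W_h, b)
    return h


# Illustrative weights (assumptions): 1 input feature, 2 hidden units.
W_x = [[1.0], [-0.5]]
W_h = [[0.5, 0.0], [0.3, 0.2]]
b = [0.0, 0.0]

# The same three inputs in two different orders.
early = run_elman([[1.0], [0.0], [0.0]], W_x, W_h, b)
late = run_elman([[0.0], [0.0], [1.0]], W_x, W_h, b)

print("pulse first:", early)
print("pulse last: ", late)
```

Because the hidden state is fed back at every step, the two orderings end in different states, which is the sense in which the recurrence captures sequence dynamics. An output layer and training loop are omitted here.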
“…Since then, several networks that utilise additional modalities, such as motion saliency (Zong et al, 2021) and audio (Wang et al, 2021), have been introduced. Recently, the introduction of pose, which is critical for the perception of actions (Le et al, 2022), has shown promising results in multi-stream architectures (Hong et al, 2019; Hayakawa and Dariush, 2020; Duan et al, 2021; Li et al, 2022). In particular, the DensePose format provides an opportunity to exploit fine-grained, segmentation map-based pose representations for action recognition.…”
Section: Related Work (mentioning)
confidence: 99%