2022
DOI: 10.48550/arxiv.2208.03775
Preprint

Video-based Human Action Recognition using Deep Learning: A Review

Abstract: Human action recognition is an important application domain in computer vision. Its primary aim is to accurately describe human actions and their interactions from a previously unseen data sequence acquired by sensors. The ability to recognize, understand and predict complex human actions enables the construction of many important applications such as intelligent surveillance systems, human-computer interfaces, health care, security and military applications. In recent years, deep learning has been given parti…

Cited by 6 publications (7 citation statements)
References 173 publications (293 reference statements)
“…action recognition [8] stem from multiple factors: complex backgrounds, variations in human body shapes, changing viewpoints, or motion speed alterations. In contrast to video-based action recognition [9]-[11], skeleton-based action recognition is less sensitive to appearance factors and has the advantage of superior efficiency by using sparse 3D skeleton data as input, which ensures fast inference speed and small memory usage. Thanks to the advancement of depth sensors [12] and lightweight and robust pose estimation algorithms [13], [14], obtaining high-quality skeleton data is becoming easier.…”
Section: Introduction (mentioning)
confidence: 99%
“…Shu et al in [6] presented the Omni-Training framework, which connects pretraining and metatraining for effective few-shot learning with limited data. These algorithms can learn complex patterns and relationships among different features of human actions, making them more effective in recognizing and classifying actions than traditional ML approaches [7].…”
Section: Introduction (mentioning)
confidence: 99%
“…Then, these features are used to recognize actions. As a result, recognizing an action based on features can be viewed as a classification problem [7].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, extending this task to drone-captured images and videos is an emerging topic. Human action recognition is a well-studied problem which is categorized into (i) pose-based [2][3][4], (ii) single-image-based [5][6][7], and (iii) video-based action recognition [8][9][10]. However, detecting actions in single images is a less explored area because it faces the problem of the unavailability of annotated temporal data for action detection [11][12][13].…”
Section: Introduction (mentioning)
confidence: 99%
“…Pareek et al [9] briefly covered the human action recognition techniques using machine learning and deep learning methods for the years 2011-2019. Pham et al [10] presented most important deep learning models for action recognition, analyzed their performance, and then discussed future prospects and challenges for recognizing actions in realistic videos. Rohrbach et al [11] prepared a dataset for kitchen activities.…”
Section: Introduction (mentioning)
confidence: 99%