2020
DOI: 10.15625/2525-2518/58/4/14742

Two-Stream Convolutional Network for Dynamic Hand Gesture Recognition Using Convolutional Long Short-Term Memory Networks

Abstract: Human action and gesture recognition provides important and valuable information for interaction between humans and ambient devices that monitor living, healthcare facilities, or entertainment activities in smart homes. In recent years, many studies have applied machine learning models to recognize human actions and gestures. In this paper, we propose a dynamic hand gesture recognition system for video based on a two-stream convolutional network (ConvNet) architecture. Specifically, we research the state-of-the-art …

Cited by 7 publications
(9 citation statements)
References 9 publications
“…Nguyen et al proposed that the dance movement can be regarded as a continuous set of postures, so it needs to be represented by the information of multiple images. In order to extract human motion features from the preprocessed images, not only should the feature information in the image at a certain time be extracted, but also the correlation between the motion features of adjacent images should be established [15]. Thomas et al state that the purpose of human dance-specific action recognition is to analyze and understand human body actions and behaviors in video.…”
Section: Related Work (mentioning)
confidence: 99%
“…To summarize, it is evident that datasets depicting more realistic conditions have a lower accuracy: datasets IPN Hand and IsoGD received the lowest accuracy with 62.14% [42], 67.71% [43], 82.90% and 85.10% [34]. All studies using less complex datasets achieved an accuracy above 90.00% [36], [38], [42].…”
Section: B Gesture Recognition With Deep Learning (mentioning)
confidence: 90%
“…Nguyen et al [36] further developed this approach on two streams of CNN architecture, allowing the system to consider both the spatial and temporal aspects of the hand gestures in videos. Hand gestures often involve complex movements and disproportions of the hand.…”
Section: B Gesture Recognition With Deep Learning (mentioning)
confidence: 99%
“…Nguyen et al [19] proposed a two-stream convolution network model on 6 classes out of 25 using the 20BN-jester dataset. MobileNet-V2 followed by LSTM was used for spatio-temporal features extraction.…”
Section: Related Work (mentioning)
confidence: 99%