2019
DOI: 10.1111/coin.12207

Deep spatiotemporal LSTM network with temporal pattern feature for 3D human action recognition

Abstract: With the rapid development of RGB‐D cameras and pose estimation techniques, action recognition based on three‐dimensional skeleton data has gained significant attention in the artificial intelligence community. In this paper, we incorporate temporal pattern descriptors of joint positions with the currently popular long short‐term memory (LSTM)–based learning scheme to obtain accurate and robust action recognition. Considering that actions are essentially formed by small subactions, we first utilize a two‐dimen…

Citations: cited by 10 publications (6 citation statements)
References: 68 publications
“…First, the objects z_1 and z_2 in the surrounding environment are measured; then, the distance between the dynamic human motion and the marker is calculated to achieve targeting localization, and finally, the human localization of motion, object localization, and the 3D spatial information e_2 are accurately calculated. At the moment k, from the estimated camera gesture z_k, the results are shown in (4).…”
Section: Tracking Object Of Determined
Citation type: mentioning
confidence: 99%
“…Relevant researchers worldwide have made some achievements in the research of this field, which have been applied to various other fields [3]. For instance, in robot vision, the camera can be used to recognize objects; thus, finally, the target object can be grasped [4]. In addition, in some disaster rescue, accurate positioning of the trapped can be realized through target tracking, thus shortening the time for search and rescue [5].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…It can be seen that whether an athlete can complete a high-quality whipping action in a game directly affects the outcome of the game [4]. According to the technical statistics of the 216 men's games in the 2010 National Sanda Championships, the use of whip legs accounted for 78.25% of all leg techniques and 39.51% of the entire Sanda technique [5]. The use rate of leg techniques was significantly higher than that of any other movements.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The fusion approach in [60] proposes to learn spatial features of individual 3D skeletons using CNN and then train an LSTM network on top of such features. In [61], multi-modal features are first extracted from the input actions and then fused by an autoencoder network. In [62], the authors propose to fuse the RGB and 3D skeleton modalities.…”
Section: Fusion Methods
Citation type: mentioning
confidence: 99%