Proceedings of the ACM Multimedia Asia 2019
DOI: 10.1145/3338533.3366569

Make Skeleton-based Action Recognition Model Smaller, Faster and Better

Abstract: Although skeleton-based action recognition has achieved great success in recent years, most existing methods may suffer from a large model size and slow execution speed. To alleviate this issue, we analyze skeleton sequence properties to propose a Double-feature Double-motion Network (DD-Net) for skeleton-based action recognition. By using a lightweight network structure (i.e., 0.15 million parameters), DD-Net can reach a very high speed: 3,500 FPS on one GPU, or 2,000 FPS on one CPU. By employing …

Citations: cited by 141 publications (70 citation statements)
References: 34 publications
“…Some existing studies have been considering the model complexity problem. The study [32] constructs a lightweight network with CNN-based blocks, which is not as accurate as GCN models. The work [35] adopts a complex data preprocessing strategy, whose inputs include positions, velocities, frame indexes and joint types.…”
Section: Related Work (mentioning)
confidence: 99%
“…The use of CNN for online signal recognition had been studied for the task of action recognition in [19] and [20]. In [19], the authors use three types of inputs relative to gesture as input to a CNN.…”
Section: Related Work (mentioning)
confidence: 99%
“…The use of CNN for online signal recognition had been studied for the task of action recognition in [19] and [20]. In [19], the authors use three types of inputs relative to gesture as input to a CNN. Each input is treated separately (in a branch) of the network and the features extracted are concatenated before being given as input to a last CNN which performs the classification.…”
Section: Related Work (mentioning)
confidence: 99%
“…Estimating human pose is the key to analyzing human behavior. HPE is the basic research in computer vision, which can be applied to many applications, such as Human-computer interaction, human action recognition [1][2][3][4], intelligent security, motion capture, and action detection [5].…”
Section: Introduction (mentioning)
confidence: 99%