2019
DOI: 10.48550/arxiv.1905.13388
Preprint

Design Light-weight 3D Convolutional Networks for Video Recognition: Temporal Residual, Fully Separable Block, and Fast Algorithm

Haonan Wang,
Jun Lin,
Zhongfeng Wang

Abstract: Deep 3-dimensional (3D) Convolutional Network (ConvNet) has shown promising performance on video recognition tasks because of its powerful spatio-temporal information fusion ability. However, the extremely intensive requirements on memory access and computing power prohibit it from being used in resource-constrained scenarios, such as portable and edge devices. So in this paper, we first propose a two-stage fully separable block to significantly compress the model sizes with little accuracy loss. Then a featu…

Citations: cited by 1 publication (2 citation statements)
References: 27 publications
“…Efficient spatiotemporal feature computation: Numerous works have developed approaches to make video processing more efficient, either through using less evidence [59,1,3,39] or through more efficient processing [40,43,35,60,52,50,27,28].…”
Section: Related Work (mentioning)
confidence: 99%
“…Due to this representational power, they are currently the best-performing models on action-related tasks like action recognition [42,21,14,6], action quality assessment [31,30], skills assessment [5], and action detection [12]. This representational power comes at a cost of increased computational complexity [60,50,58,13]. Hadidi et al. [13] recently conducted an exhaustive comparison of various CNNs from the perspective of computational cost.…”
Section: Introduction (mentioning)
confidence: 99%