2020
DOI: 10.48550/arxiv.2011.08652
Preprint

3D CNNs with Adaptive Temporal Feature Resolutions

Abstract: While state-of-the-art 3D Convolutional Neural Networks (CNN) achieve very good results on action recognition datasets, they are computationally very expensive and require many GFLOPs. While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips. In this work, we therefore introduce a differentiable Similarity Guided Sampling (SGS) module, which can be plugged into any existing 3D CNN architecture. SGS empo…

Cited by 1 publication (1 citation statement)
References 26 publications
“…For example, dense sampling captures accurate short-term dynamics and sparse sampling [27] extracts long-term dependencies. Several recent studies also investigate adaptive sampling for higher efficiency [3,13]. Spatio-Temporal Action Detection.…”
Section: Related Work (mentioning)
confidence: 99%