2021
DOI: 10.1109/tip.2021.3125263

Spatially Adaptive Feature Refinement for Efficient Inference

Cited by 21 publications (18 citation statements)
References 38 publications
“…In contrast, our deformable attention takes a powerful and yet simple design to learn a set of global keys shared among visual tokens, and can be adopted as a general backbone for various vision tasks. Our method can also be viewed as a spatial adaptive mechanism, which has been proved effective in various works [16,38].…”
Section: Related Work (mentioning)
confidence: 99%
“…Dynamic inference results. We apply our training strategy on MSDNet with 5 and 7 exits and compare with three groups of competitive baseline methods: classic networks (ResNet [13], DenseNet [18]), pruning-based approaches (Sparse Structure Selection (SSS) [19], Transformable Architecture Search (TAS) [6]), and dynamic networks (Shallow-Deep Networks (SDN) [23], Dynamic Convolutions (DynConv) [44], and Spatially Adaptive Feature Refinement (SAR) [12]).…”
Section: CIFAR Results (mentioning)
confidence: 99%
“…Improving the inference efficiency of deep learning has become a research trend. Popular solutions include lightweight architecture design [16,50], network pruning [28,29,35,48], weight quantization [20,10], and dynamic neural networks [11,37,17,47,32,1,46,44,12]. Dynamic networks have attracted considerable research interests due to their favorable efficiency and representation power [11].…”
Section: Introduction (mentioning)
confidence: 99%
“…Visual grounding (VG) task [13,24,40,65] has achieved great progress in recent years, with the advances in both computer vision [16,20,21,25,26,46,56,57,59] and natural language processing [4,14,41,50,53]. It aims to localize the objects referred by natural language queries, which is essential for various vision-language tasks, e.g., visual question answering [2] and visual commonsense reasoning [67].…”
Section: Introduction (mentioning)
confidence: 99%