2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
DOI: 10.1109/iccvw.2019.00133

Non-Discriminative Data or Weak Model? On the Relative Importance of Data and Model Resolution

Abstract: We explore the question of how the resolution of the input image ("input resolution") affects the performance of a neural network when compared to the resolution of the hidden layers ("internal resolution"). Adjusting these characteristics is frequently used as a hyperparameter providing a trade-off between model performance and accuracy. An intuitive interpretation is that the reduced information content in the low-resolution input causes decay in the accuracy. In this paper, we show that up to a point, the i…
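The distinction the abstract draws between input resolution and internal resolution can be made concrete with a small sketch. The snippet below is not code from the paper; the torchvision MobileNetV2 backbone and the 96/224 resolutions are illustrative assumptions. It contrasts feeding a low-resolution image directly, which shrinks every hidden feature map, with upsampling it first, which restores the internal resolution without adding any information.

```python
# Minimal PyTorch sketch (illustrative, not the paper's code): decoupling
# input resolution from internal resolution by upsampling a low-resolution
# image before the first layer, so hidden feature maps keep their usual size.
import torch
import torch.nn.functional as F
from torchvision.models import mobilenet_v2

model = mobilenet_v2(num_classes=1000).eval()

low_res = torch.randn(1, 3, 96, 96)            # "input resolution" = 96x96

# Baseline: feed the low-resolution image directly; every hidden layer
# (the "internal resolution") shrinks proportionally.
logits_small_internal = model(low_res)

# Alternative: upsample to 224x224 first. No new information is added,
# but the internal resolution matches that of a 224x224 model.
upsampled = F.interpolate(low_res, size=(224, 224), mode="bilinear",
                          align_corners=False)
logits_large_internal = model(upsampled)

print(logits_small_internal.shape, logits_large_internal.shape)
```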

Cited by 25 publications (25 citation statements) · References 10 publications
“…A study predating vision transformers investigates isotropic (or "isometric") MobileNets (Sandler et al., 2019), and even implements patch embeddings under another name. Their architecture simply repeats an isotropic MobileNetV3 block.…”
Section: Related Work (mentioning)
confidence: 99%
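For readers unfamiliar with the "isometric" design this citation refers to, a rough sketch follows. It is a simplified stand-in, not the authors' architecture: a strided convolution plays the role of the patch embedding, and a generic depthwise-separable block (loosely MobileNetV3-flavoured, without squeeze-and-excite) is repeated at constant spatial resolution. All widths, depths, and patch sizes are assumptions.

```python
# Hedged sketch of an "isotropic" MobileNet-style network: a strided conv
# acts as a patch embedding, then one resolution-preserving block is repeated.
import torch
import torch.nn as nn

class IsotropicBlock(nn.Module):
    """Depthwise-separable block that keeps spatial resolution fixed."""
    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        hidden = dim * expansion
        self.block = nn.Sequential(
            nn.Conv2d(dim, hidden, 1, bias=False),            # expand
            nn.BatchNorm2d(hidden), nn.Hardswish(),
            nn.Conv2d(hidden, hidden, 3, padding=1,
                      groups=hidden, bias=False),              # depthwise 3x3
            nn.BatchNorm2d(hidden), nn.Hardswish(),
            nn.Conv2d(hidden, dim, 1, bias=False),             # project
            nn.BatchNorm2d(dim),
        )

    def forward(self, x):
        return x + self.block(x)                               # residual

def isometric_net(dim=256, depth=12, patch=16, num_classes=1000):
    return nn.Sequential(
        nn.Conv2d(3, dim, kernel_size=patch, stride=patch),    # "patch embedding"
        *[IsotropicBlock(dim) for _ in range(depth)],          # constant resolution
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(dim, num_classes),
    )

x = torch.randn(1, 3, 224, 224)
print(isometric_net()(x).shape)   # every block sees a 14x14 feature map
```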
“…In Figures 4 and 5, we visualize the (complete) weights of the patch embedding layers of a ConvMixer-1536/20 with p = 14 and a ConvMixer-768/32 with p = 7, respectively. Much like Sandler et al. (2019), the layer consists of Gabor-like filters as well as "colorful globs" or rough edge detectors.…”
Section: W V (mentioning)
confidence: 99%
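The kind of weight visualization described in this citation takes only a few lines to reproduce. The sketch below uses an untrained convolution as a stand-in for a ConvMixer-style patch-embedding layer (the 64-filter width and patch size 7 are assumptions); with trained weights, the grid would show the Gabor-like filters and "colorful globs" the quote mentions.

```python
# Sketch: render the patch-embedding filters of a conv layer as an image grid.
import torch.nn as nn
import matplotlib.pyplot as plt

patch_embed = nn.Conv2d(3, 64, kernel_size=7, stride=7)   # ConvMixer-style, p = 7

w = patch_embed.weight.detach()                           # shape (64, 3, 7, 7)
w = (w - w.min()) / (w.max() - w.min())                   # rescale to [0, 1] for display

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, filt in zip(axes.flat, w):
    ax.imshow(filt.permute(1, 2, 0).numpy())               # HWC layout for imshow
    ax.axis("off")
plt.tight_layout()
plt.savefig("patch_embedding_filters.png")
```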
“…the input image resolution. As shown in Figure 8, even an extremely efficient lightweight model (MV3-small (0.35x128), computation overhead of 13.8M MAdds) can lead to a performance boost from 77.7% to 78.9% (+1.2%). This experiment shows that resolution and multiplier can have an equivalent effect, as reported in [21], and that a lightweight model with a smaller computation overhead can bring most of the performance gain. Thus it might be more beneficial to scale the model multiplier and resolution coordinately [25].…”
Section: D2 Detailed Experiments For Number Of Bases In (mentioning)
confidence: 58%
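The claim that the width multiplier and the input resolution have an "equivalent effect" follows from how convolutional cost scales: multiply-adds grow roughly with the square of each factor. The back-of-the-envelope sketch below illustrates this; the baseline layer sizes are illustrative assumptions, not the MV3-small figures quoted above.

```python
# Rough cost model: MAdds of a conv layer scale with (width multiplier)^2
# and (resolution multiplier)^2, so the two knobs trade off against each other.
def conv_madds(h, w, c_in, c_out, k=3):
    """Multiply-adds of one k x k convolution on an h x w feature map."""
    return h * w * c_in * c_out * k * k

def scaled_madds(alpha, rho, base_res=224, c_in=32, c_out=64):
    """Apply width multiplier alpha and resolution multiplier rho."""
    res = int(base_res * rho)
    return conv_madds(res, res, int(c_in * alpha), int(c_out * alpha))

base = scaled_madds(1.0, 1.0)
print(scaled_madds(0.35, 128 / 224) / base)   # ~0.35^2 * (128/224)^2 ~ 0.04
print(scaled_madds(0.5, 0.5) / base)          # ~0.5^2 * 0.5^2 ~ 0.06
```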
“…As a downside, these pyramidal approaches dramatically reduce the resolution of the last layers, and hence the quality of their attention maps, making their predictions harder to interpret. Another shortcoming is their relatively high memory usage [50].…”
Section: Related Work (mentioning)
confidence: 99%
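The resolution concern raised in this last citation is easy to quantify. The short sketch below (stage counts and patch size are assumptions) shows how a pyramidal backbone that halves the feature map at every stage ends with very coarse last-layer maps, whereas an isotropic model keeps its post-patch-embedding resolution throughout.

```python
# Sketch: last-layer spatial resolution under pyramidal vs. isotropic designs.
def pyramidal_resolutions(input_res=224, stages=5):
    res, out = input_res, []
    for _ in range(stages):
        res //= 2                     # stride-2 downsampling per stage
        out.append(res)
    return out

print(pyramidal_resolutions())        # [112, 56, 28, 14, 7] -> 7x7 final maps
print(224 // 16)                      # isotropic, patch size 16 -> 14x14 throughout
```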