2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
DOI: 10.1109/cvpr42600.2020.01428

On Translation Invariance in CNNs: Convolutional Layers Can Exploit Absolute Spatial Location

Cited by 118 publications (81 citation statements)
References 48 publications
“…Rotation and shift are two commonly used data augmentation manners in vision tasks, and these operations should not alter the final results of the model. In other words, we expect translation-invariance [24] in those tasks. However, the absolute positional encoding used in previous transformers, initially designed to leverage the order of tokens, damages such invariance because it adds unique positional encoding to each patch [6].…”
Section: Cmt Block
Citation type: mentioning
confidence: 92%
“…Position Information Encoded by Neural Network: Recent studies [15,19] have shown that convolutional neural networks can encode the position information of the input image, mainly caused by the padding operation. There also have been a series of works explicitly leverage the position information for devising different architectures (e.g., Transformer network [38], local relation network [13] and generative adversarial network [43]) or enhancing the performance for various computer vision tasks, such as instance segmentation [30], refer expression segmentation [26], etc.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Anti-aliasing in CNNs [55] increases robustness and accuracy. Reducing border effects in CNNs [19] improves translation equivariance and data efficiency. In this paper, we build on these successes and show the benefits of reducing spectral leakage in CNNs.…”
Section: Signal Processing Benefits For CNNs
Citation type: mentioning
confidence: 99%