2021
DOI: 10.48550/arxiv.2107.00645
Preprint
Global Filter Networks for Image Classification

Abstract: Recent advances in self-attention and pure multi-layer perceptron (MLP) models for vision have shown great potential in achieving promising performance with fewer inductive biases. These models are generally based on learning interaction among spatial locations from raw data. The complexity of self-attention and MLP grows quadratically as the image size increases, which makes these models hard to scale up when high-resolution features are required. In this paper, we present the Global Filter Network (GFNet), …
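The core operation GFNet proposes is a learnable global filter applied in the frequency domain: transform the spatial features with a 2D FFT, multiply element-wise by a learnable complex filter (equivalent to a depthwise global circular convolution), and transform back. A minimal NumPy sketch of that operation, with illustrative names and a random toy input (not the paper's implementation):

```python
import numpy as np

def global_filter(x, filt):
    """Apply a global filter in the frequency domain.

    x    : (H, W, C) real-valued feature map
    filt : (H, W//2 + 1, C) complex filter (learnable in GFNet)
    """
    # 2D FFT over the spatial dimensions; rfft2 exploits real-valued input
    X = np.fft.rfft2(x, axes=(0, 1))
    # Element-wise multiplication = depthwise global circular convolution
    X = X * filt
    # Back to the spatial domain
    return np.fft.irfft2(X, s=x.shape[:2], axes=(0, 1))

# Toy usage: an all-ones (identity) filter leaves the features unchanged
H, W, C = 8, 8, 4
x = np.random.randn(H, W, C)
identity = np.ones((H, W // 2 + 1, C), dtype=complex)
y = global_filter(x, identity)
print(np.allclose(x, y))  # True
```

Because the FFT and the element-wise product cost O(HW log HW) and O(HW) respectively, this mixes all spatial locations at sub-quadratic cost, which is the scaling advantage the abstract contrasts with self-attention and MLP mixing.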

Cited by 16 publications (25 citation statements); references 31 publications.
“…Recent research explorations on Vision Transformers (ViT) [10,23,56] have exemplified their great potential as alternatives to the go-to CNN models. The elegance of ViT [23] has also motivated similar model designs with simpler global operators such as MLP-Mixer [85], gMLP [53], GFNet [74], and FNet [43], to name a few. Despite successful applications to many high-level tasks [4,23,56,83,87,100], the efficacy of these global models on low-level enhancement and restoration problems has not been studied extensively.…”
Section: Enhancement
confidence: 99%
“…Very recent techniques such as FNet [43] and GFNet [74] demonstrate the simple Fourier Transform can be used as a competitive alternative to either self-attention or MLPs.…”
Section: Related Work
confidence: 99%
“…More recently, alternatives have been introduced for self-attention that relax the graph assumption for efficient mixing. Instead, they leverage geometric structure using the Fourier transform (Rao et al., 2021; Lee-Thorp et al., 2021). For instance, the Global Filter Network (GFN) proposes a depthwise global convolution for token mixing that enjoys an efficient implementation in the Fourier domain (Rao et al., 2021).…”
Section: Introduction
confidence: 99%