2023
DOI: 10.1609/aaai.v37i1.25111

Hybrid CNN-Transformer Feature Fusion for Single Image Deraining

Abstract: Since rain streaks exhibit diverse geometric appearances and irregular overlapped phenomena, these complex characteristics challenge the design of an effective single image deraining model. To this end, rich local-global information representations are increasingly indispensable for better satisfying rain removal. In this paper, we propose a lightweight Hybrid CNN-Transformer Feature Fusion Network (dubbed as HCT-FFN) in a stage-by-stage progressive manner, which can harmonize these two architectures to help i…

Cited by 27 publications
(2 citation statements)
References 42 publications
“…Due to its significant performance, transformer has also been introduced into low-level vision tasks [30–34]. SwinIR [14] designed a residual network by Swin transformer blocks and achieve state-of-the-art results in image restoration. Wang et al. [35] presented an efficient transformer-based network for image restoration named Uformer.…”
Section: Vision Transformer
mentioning
confidence: 99%
“…We have witnessed the rapid advancement of CNNs in image dehazing and deraining (Li et al, 2020; Zhou et al, 2021; Chen et al, 2022). However, due to the inherent characteristics of convolution operations, specifically the use of local receptive fields and the independence of input content, CNNs struggle to effectively model spatially-long feature dependencies of images (Chen et al, 2023c).…”
mentioning
confidence: 99%