2022
DOI: 10.1049/ipr2.12668
An unsupervised multi‐focus image fusion method based on Transformer and U‐Net

Abstract: This work presents a multi-focus image fusion method based on Transformer and U-Net, trained in an unsupervised fashion. The authors introduce the Transformer into image fusion because of its strong ability to capture global dependencies and low-frequency features. In image processing, a convolutional neural network (CNN) performs well at extracting detailed features but is weak at extracting global features, whereas the Transformer has limited power in extracting local or detailed information bu…
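The abstract pairs a CNN branch (local detail) with a Transformer branch (global dependencies). As a rough illustration of that pairing only, here is a minimal, hypothetical PyTorch block; the module names, layer choices, and dimensions are assumptions for exposition, not the authors' actual U-Net/Transformer architecture.

```python
# Illustrative sketch only: a hybrid block combining a CNN branch (local
# detail) with a Transformer branch (global dependencies), in the spirit of
# the abstract. All names and hyperparameters are assumptions, not the
# paper's architecture.
import torch
import torch.nn as nn

class HybridFusionBlock(nn.Module):
    def __init__(self, channels: int = 64, heads: int = 4):
        super().__init__()
        # CNN branch: captures local/detailed features
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Transformer branch: captures global dependencies via self-attention
        self.attn = nn.TransformerEncoderLayer(
            d_model=channels, nhead=heads, batch_first=True
        )
        # 1x1 conv merges the two feature streams
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)
        # Flatten spatial dims into a token sequence for self-attention
        tokens = x.flatten(2).transpose(1, 2)               # (B, H*W, C)
        global_ = self.attn(tokens).transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local, global_], dim=1))

if __name__ == "__main__":
    block = HybridFusionBlock()
    out = block(torch.randn(1, 64, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```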

Citations: cited by 16 publications (10 citation statements)
References: 59 publications
“…The transformer model has been used in many machine translation projects with good results in recent years [22]. Transformers do not require RNNs to perform their decoding task; they use only the attention mechanism.…”
Section: Related Work
Confidence: 99%
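The quoted statement's key point, that the Transformer replaces recurrence with attention alone, comes down to scaled dot-product attention. Below is a minimal NumPy sketch of that operation; shapes and naming follow the standard formulation, not anything specific to the fusion paper above.

```python
# Scaled dot-product attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
# Standard formulation only; toy shapes, no ties to the paper's model.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)        # (B, L_q, L_k)
    # Numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                       # (B, L_q, d_v)

# Toy usage: one batch, 4 query tokens attending over 6 key/value tokens.
rng = np.random.default_rng(0)
Q = rng.normal(size=(1, 4, 8))
K = rng.normal(size=(1, 6, 8))
V = rng.normal(size=(1, 6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (1, 4, 8)
```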
“…Inspired by non-local (Wang et al., 2018) approaches, some methods (Chen et al., 2016; Liu et al., 2017; Ding et al., 2018; Li et al., 2019; Hou et al., 2020; Pal et al., 2022) use attentional mechanisms to establish connections between image contexts. Transformer architectures also achieve good results in semantic segmentation, focusing on multi-scale feature fusion (Zhang et al., 2020; Chen et al., 2021; Wang et al., 2021; Xie et al., 2021; Jin et al., 2022a, b, c, 2023) and contextual feature aggregation (Liu et al., 2021; Strudel et al., 2021; Yan et al., 2022). For example, SETR (Zheng et al., 2021) uses the transformer framework to serialize images to achieve a fully attention-based feature representation encoder.…”
Section: Related Work
Confidence: 99%
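The "serialization" the quote attributes to SETR means cutting the image into fixed-size patches and treating each patch as a token before the attention-based encoder. A small illustrative sketch follows; the patch size and embedding dimension are arbitrary choices, not SETR's actual settings.

```python
# ViT/SETR-style image serialization: split the image into fixed-size
# patches and project each patch to a token embedding. Patch size and
# dimensions here are illustrative assumptions.
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    def __init__(self, patch: int = 16, in_ch: int = 3, dim: int = 256):
        super().__init__()
        # A strided conv implements patch extraction + linear projection
        # in one step: each kernel application covers exactly one patch.
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, dim, H/p, W/p) -> (B, N, dim) token sequence
        return self.proj(x).flatten(2).transpose(1, 2)

tokens = PatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 256]), i.e. 14 x 14 patches
```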
“…Proposed work (gradient):

Method               MI      Q       Q^{AB/F}  STD
MST-SR [32]          5.15    0.76    0.74      57.42
FusionGAN [33]       3.5     0.5     0.37      48.35
ECNN [16]            6.15    0.8072  0.75      57.51
CNN [14]             5.96    0.8084  0.7618    57.46
QB [34]              5.55    0.7827  0.7446    57.5385
IFCNN [21]           4.8797  0.7292  0.7296    57.5502
U-NET [35]           6.1358  0.5642  0.6382    X
U2Fusion [36]        7.82    X       0.75      57.52
Bouzos et al. [37]   7.36    0.7557  0.7143    X
Li et al. …”
Section: Yang et al.'s metric (Q)
Confidence: 99%
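The MI column in the table above is the usual fusion-quality measure: mutual information between the fused image and each source, summed. Here is a hedged sketch of one common histogram-based formulation; the bin count and exact normalization are assumptions, and published implementations differ in detail.

```python
# Histogram-based mutual information between a fused image and its sources.
# One common formulation only; bin count and normalization are assumptions.
import numpy as np

def mutual_information(a, b, bins=256):
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()            # joint grey-level distribution
    px = pxy.sum(axis=1, keepdims=True)  # marginal of a
    py = pxy.sum(axis=0, keepdims=True)  # marginal of b
    nz = pxy > 0                         # avoid log(0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def fusion_mi(src_a, src_b, fused):
    # MI-based fusion score: MI(A, F) + MI(B, F)
    return mutual_information(src_a, fused) + mutual_information(src_b, fused)

# Toy usage with random 8-bit images.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, (64, 64))
b = rng.integers(0, 256, (64, 64))
f = ((a.astype(float) + b) / 2).astype(np.uint8)
print(round(fusion_mi(a, b, f), 3))
```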