“…First, CNNs are trained on visible, infrared, and fused images to learn the weightings required for fusion [22,23,24,25,26,27,28,29]. Second, pre-trained network models are used only to extract features and derive weight maps from the images, thereby achieving the fusion objective [30,31,32,33];
- Generative Adversarial Network (GAN)-based methods recast the integration of visible and infrared images as an adversarial process between a generator and a discriminator: the generator combines the visible and infrared images, while the discriminator evaluates whether the fused image retains sufficient visible and infrared information [34,35,36,37,38,39,40];
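The weight-map strategy above can be sketched concretely. In the following minimal NumPy illustration, a hand-crafted gradient-based activity measure stands in for the deep features a pre-trained CNN would extract (an assumption made purely for a self-contained example); the per-pixel weight map then blends the two source images:

```python
import numpy as np

def weight_map_fusion(vis, ir, eps=1e-8):
    """Fuse a visible and an infrared image via per-pixel weight maps.

    A real CNN-based method would derive the activity maps from deep
    features of a pre-trained network; here local gradient magnitude is
    used as a simple hand-crafted substitute (illustrative assumption).
    """
    def activity(img):
        # Local gradient magnitude as a crude saliency/activity measure.
        gy, gx = np.gradient(img.astype(np.float64))
        return np.hypot(gx, gy)

    a_vis, a_ir = activity(vis), activity(ir)
    # Normalize the two activity maps into a weight map in [0, 1].
    w_vis = a_vis / (a_vis + a_ir + eps)
    # Per-pixel weighted average of the two source images.
    return w_vis * vis + (1.0 - w_vis) * ir
```

Because the fused result is a convex per-pixel combination, every output pixel lies between the corresponding visible and infrared pixel values; learned weight maps generalize this by letting the network decide which source dominates where.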
- Encoder-decoder-based networks consist of two main components: an encoder and a decoder.
…”