To obtain fused images with excellent contrast, distinct target edges, and well-preserved details, we propose an adaptive image fusion network called the adjacent feature shuffle-fusion network (AFSFusion). The proposed network adopts a UNet-like architecture and introduces refinements to both the network architecture and the loss functions. On the architecture side, the proposed two-branch adjacent feature fusion module (AFSF) expands the channel count to fuse the feature channels of several adjacent convolutional layers in the first half of AFSFusion, strengthening its ability to extract, transmit, and modulate feature information. We replace the original rectified linear unit (ReLU) with leaky ReLU to alleviate the vanishing-gradient problem and add a channel shuffling operation at the end of the AFSF to facilitate information interaction between features. On the loss side, we propose an adaptive weight adjustment (AWA) strategy that assigns weight values to the corresponding pixels of the infrared (IR) and visible images in the fused image according to the VGG16 gradient feature responses of the IR and visible images; this strategy handles varied scene content efficiently. After normalization, the weight values serve as weighting coefficients for the two sets of images and are applied to three loss terms simultaneously: mean square error (MSE), structural similarity (SSIM), and total variation (TV), yielding clearer objects and richer texture detail in the fused images. Experiments on several benchmark databases demonstrate the effectiveness of the proposed architecture and its superiority over other state-of-the-art fusion methods: AFSFusion ranks first on several objective metrics and produces sharper, richer edges around salient targets, in better agreement with human visual perception. These performance gains are attributable to the proposed AFSF module and AWA strategy, which enable balanced extraction, fusion, and modulation of image features throughout the pipeline.
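A minimal PyTorch sketch of the two ideas above, offered as illustration only: an AFSF-style block that concatenates features from adjacent layers, activates with leaky ReLU, and ends with a channel shuffle, plus an AWA-style weighted MSE term. The channel counts, kernel size, two-branch wiring, and the weighted-loss form are our assumptions; the abstract does not specify the exact configuration.

```python
# Illustrative sketch (assumptions: channel counts, kernel size, two-branch
# wiring, and the weighted-loss form are ours, not the paper's exact design).
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channel groups so information mixes across features."""
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(b, c, h, w)


class AFSFBlock(nn.Module):
    """AFSF-style block: fuse two adjacent feature maps, then shuffle."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Concatenating two adjacent feature maps doubles the channel count,
        # mirroring the channel-expansion step described in the abstract.
        self.fuse = nn.Conv2d(2 * in_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.LeakyReLU(0.2, inplace=True)  # leaky ReLU replaces ReLU

    def forward(self, feat_prev: torch.Tensor, feat_curr: torch.Tensor):
        x = torch.cat([feat_prev, feat_curr], dim=1)  # adjacent-layer fusion
        x = self.act(self.fuse(x))
        return channel_shuffle(x, groups=2)  # cross-channel interaction


def awa_weighted_mse(fused, ir, vis, w_ir, w_vis):
    """AWA-style per-pixel weighting applied to the MSE term; the SSIM and
    TV terms would be weighted with the same coefficients (w_ir + w_vis = 1
    after normalization)."""
    return (w_ir * (fused - ir) ** 2 + w_vis * (fused - vis) ** 2).mean()


if __name__ == "__main__":  # quick shape check
    block = AFSFBlock(32, 64)
    a, b = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
    print(block(a, b).shape)  # torch.Size([1, 64, 64, 64])
```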
Image denoising remains a significant challenge in computer vision because high-level vision tasks depend on image quality. Many advanced denoising models have been proposed in recent decades. Recently, the deep image prior (DIP), which denoises using only a particular network structure and the noisy image itself, has provided a novel approach to image denoising. However, the denoising performance of the DIP model still lags behind that of mainstream denoising models. To close this gap, we propose a TripleDIP model with mixed internal and external image priors for image denoising. TripleDIP comprises three branches: one for content learning and two for independent noise learning. We first use a Transformer-based supervised model (i.e., Restormer) to obtain a pre-denoised image (used as the external prior) from a given noisy image, and then take the noisy image and the pre-denoised image as the first and second target images, respectively, during denoising under the designed loss function. We add constraints between the two noise-learning branches and the content-learning branch, allowing TripleDIP to exploit the external prior while stabilizing independent noise learning. Moreover, our proposed automatic stopping criterion prevents the model from overfitting to the noisy image and improves execution efficiency. On the Set12 dataset, TripleDIP outperforms the original DIP by an average of 2.79 dB, classical unsupervised methods such as N2V by an average of 2.68 dB, and the latest supervised models SwinIR and Restormer by averages of 0.63 dB and 0.59 dB, respectively. This is mainly because two-branch noise learning obtains more stable noise estimates while constraining the optimization of the content-learning branch. TripleDIP substantially improves DIP denoising performance and has broad application potential in scenarios with insufficient training data.
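A minimal sketch of a triple-branch DIP-style objective, assuming the content branch plus each noise branch should reconstruct the noisy image (first target) while the content branch is pulled toward the Restormer pre-denoised image (second target). The abstract does not publish the concrete TripleDIP loss, so the coupling term and all coefficients below are hypothetical.

```python
# Hypothetical sketch of a triple-branch DIP-style objective. The coupling
# term and all coefficients are assumptions, not the authors' formulation.
import torch
import torch.nn.functional as F


def tripledip_loss(content: torch.Tensor,
                   noise1: torch.Tensor,
                   noise2: torch.Tensor,
                   noisy: torch.Tensor,
                   pre_denoised: torch.Tensor,
                   alpha: float = 1.0,
                   beta: float = 1.0,
                   gamma: float = 0.1) -> torch.Tensor:
    # Each independent noise branch, added to the content estimate, should
    # explain the first target: the noisy input image.
    rec1 = F.mse_loss(content + noise1, noisy)
    rec2 = F.mse_loss(content + noise2, noisy)
    # External prior: pull the content estimate toward the second target,
    # the Restormer pre-denoised image.
    prior = F.mse_loss(content, pre_denoised)
    # Assumed constraint coupling the two noise estimates for stability.
    consistency = F.mse_loss(noise1, noise2)
    return alpha * (rec1 + rec2) + beta * prior + gamma * consistency
```

In a full DIP setup, `content`, `noise1`, and `noise2` would be the outputs of three generator networks fed fixed random codes, optimized jointly under this loss until the stopping criterion fires.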