Estimating sharp images from blurry observations remains a difficult task in image processing. Existing methods may produce deblurred images that lose details or contain artifacts. One feasible remedy is to draw on additional images, such as a near-infrared image or a flash image. In this paper, we propose a fusion framework for image deblurring, called Guided Deblurring Fusion Network (GDFNet), which integrates multi-modal information to improve deblurring performance. Unlike previous works that directly compute a deblurred image from paired multi-modal degraded and guidance images, GDFNet obtains the deblurred image through image fusion. GDFNet combines the advantages of single-image and guided image deblurring by fusing their pre-deblurred streams with a convolutional neural network (CNN). We adopt a blur/residual image splitting strategy, fusing the residual images to enhance the representation ability of the encoders and preserve details. We also employ a two-level coarse-to-fine reconstruction strategy and supervise its multi-scale outputs to improve fusion and deblurring performance. Quantitative comparisons on multi-modal image datasets show that GDFNet can recover correct structures and produce fewer artifacts while preserving more details. The peak signal-to-noise ratio (PSNR) of GDFNet (with MPR) evaluated on the blurry/flash dataset is at least 0.9 dB higher than that of the compared algorithms.

INDEX TERMS Blind image deblurring, guided image deblurring, image deblurring, image fusion, multimodal image fusion.
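
For readers unfamiliar with the PSNR metric reported above, the following is a minimal sketch of its standard definition, PSNR = 10 log10(MAX^2 / MSE). It is not taken from the GDFNet implementation: the function name `psnr`, the flat-sequence image representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration only.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return the PSNR in dB between two equally sized pixel sequences.

    `reference` and `estimate` are flat sequences of pixel intensities
    (an assumed representation); `max_value` is the peak intensity,
    255 for 8-bit images.
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    sharp = [100, 100, 100, 100]      # toy "ground truth" pixels
    deblurred = [110, 90, 110, 90]    # toy estimate, each pixel off by 10
    print(f"PSNR: {psnr(sharp, deblurred):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.9 dB corresponds to an MSE that is lower by a factor of about 10^0.09 ≈ 1.23, that is, roughly 19% less squared error on average.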