Recently, with the rise of convolutional neural networks (CNNs), CNN-based remote sensing image super-resolution (RSSR) methods have advanced considerably and shown great power in image reconstruction tasks. However, most of these methods cannot adequately handle the enormous number of objects at different scales contained in remote sensing images, which limits super-resolution performance. To address this issue, we propose a Multi-Scale Fast Fourier Transform (FFT) based Attention Network (MSFFTAN), which employs a multi-input U-shaped structure as its backbone for accurate remote sensing image super-resolution. Specifically, we design an FFT-based residual block consisting of an image domain branch and a Fourier domain branch to extract local details and global structures simultaneously. In addition, a Local-Global Channel Attention Block (LGCAB) is developed to further enhance the reconstruction of small targets. Finally, we present a Branch Gated Selective Block (BGSB) to adaptively explore and aggregate features from multiple scales and depths. Extensive experiments on two public datasets demonstrate the superiority of MSFFTAN over state-of-the-art (SOTA) approaches in both quantitative metrics and visual quality. The peak signal-to-noise ratio (PSNR) of our network is 1.5 dB higher than that of the SOTA method on the UCMerced LandUse dataset with a downscaling factor of 2.
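The two-branch idea behind the FFT-based residual block can be illustrated with a minimal NumPy sketch. This is not the MSFFTAN implementation: the function names `conv3x3_same` and `fft_residual_block`, the single-channel input, the fixed 3x3 kernel, and the per-frequency weights are all assumptions made for illustration. A small spatial kernel only sees a local neighborhood, whereas reweighting a Fourier coefficient affects every pixel, which is one way an image domain branch and a Fourier domain branch can complement each other.

```python
import numpy as np


def conv3x3_same(x, kernel):
    """Naive zero-padded 3x3 cross-correlation that keeps the input shape."""
    padded = np.pad(x, 1)  # zero padding of one pixel on every side
    out = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out


def fft_residual_block(x, spatial_kernel, freq_weight):
    """Hypothetical two-branch residual block (illustrative assumption only).

    x              : (H, W) real-valued feature map
    spatial_kernel : (3, 3) weights for the image domain branch
    freq_weight    : (H, W // 2 + 1) weights for the Fourier domain branch
    """
    # Image domain branch: each output pixel depends on a 3x3 neighborhood.
    local = conv3x3_same(x, spatial_kernel)
    # Fourier domain branch: each frequency weight influences all pixels.
    spectrum = np.fft.rfft2(x)
    global_ = np.fft.irfft2(spectrum * freq_weight, s=x.shape)
    # Residual connection: add both branches back onto the input.
    return x + local + global_
```

In a trained network the kernel and frequency weights would be learned, and nonlinearities and multiple channels would be involved; the sketch keeps only the branch structure.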
Single Image Super‐Resolution algorithms have made enormous progress in recent years. However, many previous Convolutional Neural Network (CNN) based Super‐Resolution algorithms only stack uniform convolution layers of fixed kernel size and frequently ignore the inherent multi‐scale properties of images, resulting in unsatisfactory reconstruction results. Here, a multi‐feature fusion attention network (MFFAN) is proposed for capturing information at diverse scales. MFFAN is composed of multiple efficient sparse residual group (ESRG) modules. In each ESRG module, several multi‐scale feature fusion blocks (MSFFB) are cascaded, which allows the module to exploit cross‐scale information. Subsequently, a local‐global spatial attention block (LGSAB) is inserted at the tail of the ESRG module to further improve inter‐pixel interaction, which strengthens essential features and suppresses irrelevant information. Additionally, because feeding only the final output into the reconstruction layer exacerbates the long‐range dependency problem, an enhanced hierarchy feature fusion block (EHFFB) is designed to fuse low‐level information and high‐level semantic information. Experimental results indicate that the proposed MFFAN is competitive with several state‐of‐the‐art algorithms.
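The contrast between a single fixed kernel size and multi-scale fusion can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the MSFFB design: `box_filter` and `multi_scale_fuse` are hypothetical names, mean filters stand in for learned convolutions, and the fusion is a plain weighted sum with a residual connection.

```python
import numpy as np


def box_filter(x, k):
    """Mean filter with odd kernel size k and edge padding (shape-preserving)."""
    r = k // 2
    padded = np.pad(x, r, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)


def multi_scale_fuse(x, kernel_sizes=(3, 5), weights=None):
    """Hypothetical multi-scale fusion: filter at several sizes, then combine.

    Box filters are a stand-in (assumption) for learned convolutions with
    different receptive fields.
    """
    # One response map per kernel size, stacked as (n_scales, H, W).
    responses = np.stack([box_filter(x, k) for k in kernel_sizes])
    if weights is None:
        # Default assumption: equal weight for every scale.
        weights = np.full(len(kernel_sizes), 1.0 / len(kernel_sizes))
    # Weighted sum over the scale axis gives an (H, W) fused map.
    fused = np.tensordot(weights, responses, axes=1)
    # Residual connection back to the input.
    return x + fused
```

Each kernel size responds to structure at a different spatial extent, so combining the responses gives the next layer access to more than one scale at once.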