For a long time, studies of image super-resolution focused on increasing the depth and width of convolutional neural networks to improve accuracy. Although overall performance has grown as networks became deeper and wider, these two factors have recently begun to show saturating accuracy and diminishing returns. To address this problem, we improve model efficiency by bringing two effective strategies from high-level vision tasks, multi-cardinality and spatial attention, into image super-resolution. We propose a novel and efficient architecture, the aggregated residual attention network (ARAN), which sets a new state of the art in model efficiency. Following the multi-cardinality strategy, we use group convolutions in each basic module. Moreover, we apply spatial attention blocks as gate units, counterparts to the basic modules, to capture detailed information in the input images. ARAN demonstrates extra representation ability compared with both the same-sized enhanced deep super-resolution network (EDSR) baseline and the current state-of-the-art cascading residual network (CARN). The experiments suggest the effectiveness of these two previously missing strategies. In particular, in terms of model efficiency, ARAN exceeds almost all current medium-sized models. Code and pre-trained models are publicly available on GitHub: https://github.com/Xingrun-Xing/ARAN.
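To make the two strategies concrete, the following is a minimal NumPy sketch (not the authors' implementation; all function names and shapes here are illustrative assumptions): a grouped 1x1 convolution splits the input channels into independent cardinality groups, and a spatial attention gate rescales each spatial location by a sigmoid mask computed from the channel-wise mean.

```python
import numpy as np

def group_conv1x1(x, weights, groups):
    """Grouped 1x1 convolution (multi-cardinality sketch).

    x: feature map of shape (C_in, H, W)
    weights: list of per-group matrices, each (C_out_g, C_in_g)
    Each group sees only its own slice of the input channels.
    """
    c_in = x.shape[0]
    group_size = c_in // groups
    outs = []
    for g in range(groups):
        xg = x[g * group_size:(g + 1) * group_size]   # (C_in_g, H, W)
        w = weights[g]                                # (C_out_g, C_in_g)
        # 1x1 conv == channel-mixing matrix multiply at every pixel
        outs.append(np.tensordot(w, xg, axes=([1], [0])))
    return np.concatenate(outs, axis=0)

def spatial_attention_gate(x):
    """Spatial attention sketch: a per-pixel sigmoid mask from the
    channel-wise mean, broadcast over all channels as a gate."""
    mask = 1.0 / (1.0 + np.exp(-x.mean(axis=0, keepdims=True)))  # (1, H, W)
    return x * mask

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))                    # 8 channels, 4x4 map
weights = [rng.standard_normal((4, 4)) for _ in range(2)]  # 2 groups
y = spatial_attention_gate(group_conv1x1(x, weights, groups=2))
print(y.shape)  # (8, 4, 4)
```

With `groups` equal to the cardinality, each group's weight matrix is a factor of `groups` smaller than a dense channel mixer, which is the efficiency argument behind multi-cardinality; the gate adds only a cheap elementwise rescaling.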