The emergence of convolutional neural network (CNN) and transformer has recently facilitated significant advances in image super-resolution (SR) tasks. However, these networks commonly construct complex structures, having huge model parameters and high computational costs, to boost reconstruction performance. In addition, they do not consider the structural prior well, which is not conducive to high-quality image reconstruction. In this work, we devise a lightweight interactive feature inference network (IFIN), complementing the strengths of CNN and Transformer, for effective image SR reconstruction. Specifically, the interactive feature aggregation module (IFAM), implemented by structure-aware attention block (SAAB), Swin Transformer block (SWTB), and enhanced spatial adaptive block (ESAB), serves as the network backbone, progressively extracts more dedicated features to facilitate the reconstruction of high-frequency details in the image. SAAB adaptively recalibrates local salient structural information, and SWTB effectively captures rich global information. Further, ESAB synergetically complements local and global priors to ensure the consistent fusion of diverse features, achieving high-quality reconstruction of images. Comprehensive experiments reveal that our proposed networks attain state-of-the-art reconstruction accuracy on benchmark datasets while maintaining low computational demands. Our code and results are available at: https://github.com/wwaannggllii/IFIN.