The rapid development of deep neural networks has attracted significant attention in the infrared and visible image fusion field. However, most existing fusion models have many parameters and consume high computational and spatial resources. This paper proposes a fast and efficient recursive fusion neural network model to solve this complex problem that few people have touched. Specifically, we designed an attention module combining a traditional fusion knowledge prior with channel attention to extract modal-specific features efficiently. We used a shared attention layer to perform the early fusion of modal-shared features. Adopting parallel dilated convolution layers further reduces the network’s parameter count. Our network is trained recursively, featuring minimal model parameters, and requires only a few training batches to achieve excellent fusion results. This significantly reduces the consumption of time, space, and computational resources during model training. We compared our method with nine SOTA methods on three public datasets, demonstrating our method’s efficient training feature and good fusion results.