In low-resolution (LR) to high-resolution (HR) image reconstruction, denoising diffusion probabilistic models (DDPMs) are recognized for perceptual quality superior to that of other generative models, owing to their ability to handle the varied degradation factors in LR images, such as noise and blur. However, existing DDPMs reconstruct super-resolution (SR) images from a single modality, the LR input alone, and thus overlook the rich information available in multimodal data. Without integrated processing of multiple modalities, the complementary characteristics of different data types cannot be fully exploited, which limits effectiveness across a broad range of applications. Moreover, DDPMs require thousands of network evaluations to reconstruct a high-quality SR image, which severely limits their efficiency. To address these challenges, a novel multimodal DDPM based on structured knowledge distillation (MKDDPM) is introduced. MKDDPM incorporates sparse prior information from a second modality into its network architecture, constraining the solution space and enriching the detail features of the reconstructed image. Furthermore, a structured knowledge distillation method is proposed: starting from a well-trained DDPM, a new DDPM is learned iteratively, with each iteration halving the number of sampling steps of the previous model. This method significantly reduces the number of sampling steps without compromising sampling quality. Experimental results demonstrate that MKDDPM achieves superior performance even with a substantially reduced number of diffusion steps, providing a novel solution for single-image SR tasks.
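The step-halving distillation described above can be illustrated with a minimal sketch in the style of progressive distillation: a student DDPM is trained to reproduce, in one denoising step, the result of two consecutive teacher steps, so each distillation round halves the sampling budget. The sketch below is an assumption-laden illustration, not the authors' implementation: `teacher` and `student` are hypothetical DDPM networks exposing a single-step `denoise(x, t, cond)` method, `alphas` holds a cumulative noise schedule of length T, and `cond` stands in for the sparse prior drawn from the second modality.

```python
# Minimal sketch (PyTorch) of one halving-based distillation round.
# All names (denoise, alphas, cond) are illustrative assumptions.
import copy
import torch

def distill_one_round(teacher, student, alphas, data_loader, optimizer, device="cpu"):
    """Train the student to match two consecutive teacher denoising steps
    with a single step, halving the number of sampling steps."""
    teacher.eval()
    student.train()
    T = len(alphas)  # teacher's step count, assumed even
    for x0, cond in data_loader:  # cond: sparse prior from the other modality
        x0, cond = x0.to(device), cond.to(device)
        # Sample an even teacher timestep so that t and t-1 form one student step.
        t = 2 * torch.randint(1, T // 2 + 1, (x0.size(0),), device=device)
        noise = torch.randn_like(x0)
        a_t = alphas[t - 1].view(-1, 1, 1, 1)  # cumulative alpha at step t
        x_t = a_t.sqrt() * x0 + (1 - a_t).sqrt() * noise  # forward-diffused input
        with torch.no_grad():
            # Two consecutive teacher denoising steps: t -> t-1 -> t-2 ...
            x_mid = teacher.denoise(x_t, t, cond)
            x_tgt = teacher.denoise(x_mid, t - 1, cond)
        # ... matched by a single student step on the coarser T/2 schedule.
        x_pred = student.denoise(x_t, t // 2, cond)
        loss = torch.nn.functional.mse_loss(x_pred, x_tgt)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return student

# Usage sketch: initialise the student from the teacher and repeat the round;
# k rounds reduce a T-step sampler to T / 2**k steps.
# student = copy.deepcopy(teacher)
# student = distill_one_round(teacher, student, alphas, loader,
#                             torch.optim.Adam(student.parameters()))
```

After each round the student becomes the teacher for the next, so the sampling cost shrinks geometrically while each student only ever has to imitate a model twice as slow as itself, which is what allows the step count to drop without a matching drop in sampling quality.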