The traditional detection methods of steel surface defects have some problems, such as a lack of feature extraction ability, sluggish detection speed, and subpar detection performance. In this paper, a YOLOv8-based DDI-YOLO model is suggested for effective steel surface defect detection. First, on the Backbone network, the extended residual module (DWR) is fused with the C2f module to obtain C2f_DWR, and the two-step approach is used to carry out the effective extraction of multiscale contextual information, and then fusing feature maps formed from the multiscale receptive fields to enhance the capacity for feature extraction. Also based on the above, an extended heavy parameter module (DRB) is added to the structure of C2f_DWR to make up for the lack of C2f’s ability to capture small-scale pattern defects between training to enhance the training fluency of the model. Finally, the Inner-IoU loss function is employed to enhance the regression accuracy and training speed of the model. The experimental results show that the detection of DDI-YOLO on the NEU-DET dataset improves the mAP by 2.4%, the accuracy by 3.3%, and the FPS by 59 frames/s compared with the original YOLOv8n.Therefore, this paper’s proposed model has a superior mAP, accuracy, and FPS in identifying surface defects in steel.