In industrial manufacturing, metal surface defect detection often suffers from low detection accuracy, high leakage rates, and false detection rates. To address these issues, this paper proposes a novel model named DSL-YOLO for metal surface defect detection. First, we introduce the C2f_DWRB structure by integrating the DWRB module with C2f, enhancing the model’s ability to detect small and occluded targets and effectively extract sparse spatial features. Second, we design the SADown module to improve feature extraction in challenging tasks involving blurred images or very small objects. Finally, to further enhance the model’s capacity to extract multi-scale features and capture critical image information (such as edges, textures, and shapes) without significantly increasing memory usage and computational cost, we propose the LASPPF structure. Experimental results demonstrate that the improved model achieves significant performance gains on both the GC10-DET and NEU-DET datasets, with a mAP@0.5 increase of 4.2% and 2.6%, respectively. The improvements in detection accuracy highlight the model’s ability to address common challenges while maintaining efficiency and feasibility in metal surface defect detection, providing a valuable solution for industrial applications.