Real-time image processing and region-of-interest extraction are crucial in nonstandard welding processes guided by structured-light vision. However, constraints on detection speed, accuracy, and applicability make existing methods difficult to apply directly. To address these issues, we experimentally screened a range of improvements and propose a practical lightweight algorithm, YOLO-DGB, based on the mainstream object detector YOLOv5s. The proposed algorithm introduces depthwise separable convolution and Ghost modules into the backbone network to reduce the number of parameters and floating-point operations (FLOPs) during detection. To meet the accuracy requirements of the detection network, a bottleneck transformer is introduced after the spatial pyramid pooling-fast (SPPF) module, improving detection performance while keeping the parameter count and computational cost reasonable. To address the shortage of training data, we propose an improved DCGAN to augment the collected images. Compared with the original YOLOv5s network, the proposed algorithm reduces the number of parameters and FLOPs by 35.5% and 44.8%, respectively, while raising the model's mAP from 96.5% to 98.1%. Experimental results demonstrate that the algorithm effectively meets the requirements of actual production processes.
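The parameter savings from depthwise separable convolution and Ghost modules can be illustrated by counting layer weights. The sketch below uses the standard formulas for these building blocks; the layer sizes and Ghost hyperparameters are hypothetical examples, not values taken from YOLO-DGB.

```python
# Illustrative parameter counts for a standard convolution, a depthwise
# separable convolution, and a Ghost module (biases omitted). The layer
# configuration below is a made-up example, not from the paper.

def conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution."""
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1x1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

def ghost_params(k, c_in, c_out, s=2, d=3):
    """Ghost module: a primary convolution produces c_out/s intrinsic
    channels; cheap d x d depthwise operations generate the remaining
    (s - 1) groups of "ghost" feature maps."""
    m = c_out // s
    return k * k * c_in * m + d * d * m * (s - 1)

# Hypothetical backbone layer: 3x3 kernel, 64 -> 128 channels.
k, c_in, c_out = 3, 64, 128
print(conv_params(k, c_in, c_out))          # 73728
print(dw_separable_params(k, c_in, c_out))  # 8768
print(ghost_params(k, c_in, c_out))         # 37440
```

For this example layer, the depthwise separable variant needs roughly an eighth of the standard convolution's weights and the Ghost module about half, which is the mechanism behind the backbone's reduced parameter and FLOP counts.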