Multi-object detection for autonomous electric locomotives in underground coal mines suffers from low accuracy and difficulty in identifying small targets, owing to challenging environmental conditions such as irregular lighting patterns and ambient noise. To address these challenges, a new network model, SOD-YOLOv5s-4L, is proposed for multi-object detection for autonomous electric locomotives in underground coal mines. The model is built on YOLOv5s with three improvements. First, the SIoU loss function is introduced to address the mismatch between the directions of the real and predicted bounding boxes, helping the model learn target position information more efficiently. Second, a decoupled head is introduced to enhance feature fusion and improve the positioning precision of the network, enabling rapid capture of multi-scale target features. Third, a small target detection layer is added, which increases the number of detection layers from three to four and strengthens the detection capability of the model. Experimental results on a multi-object detection dataset show that the proposed model achieves a mean average precision (mAP) of almost 98% across the various target types and an average precision (AP) of nearly 99% for small targets, improvements of 5.19% (mAP) and 9.79% (AP) over the YOLOv5s model. Furthermore, a comparative analysis with other models, including YOLOv7 and YOLOv8, shows that the proposed model delivers superior object detection performance.
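To make the role of the SIoU loss more concrete, the following is a minimal sketch of the commonly cited SIoU formulation (IoU term plus angle-aware distance cost and shape cost) for a single pair of boxes. It is an illustration under stated assumptions, not the implementation used in this work: the function name `siou_loss`, the corner box format `(x1, y1, x2, y2)`, the shape exponent `theta=4.0`, and the stabilizer `eps` are all choices made here for the example.

```python
import math


def siou_loss(pred, target, theta=4.0, eps=1e-9):
    """Illustrative SIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper: the name, box format, theta and eps are assumptions.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Offset between box centers, and size of the smallest enclosing box.
    dx = (tx1 + tx2) / 2.0 - (px1 + px2) / 2.0
    dy = (ty1 + ty2) / 2.0 - (py1 + py2) / 2.0
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Angle cost: 0 when the centers are axis-aligned, 1 at 45 degrees.
    sigma = math.hypot(dx, dy)
    sin_alpha = min(1.0, abs(dy) / (sigma + eps))
    angle_cost = math.sin(2.0 * math.asin(sin_alpha))

    # Distance cost, weighted by the angle cost through gamma.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1.0 - math.exp(-gamma * rho_x)) + (1.0 - math.exp(-gamma * rho_y))

    # Shape cost: penalizes width and height mismatch.
    omega_w = abs(pw - tw) / (max(pw, tw) + eps)
    omega_h = abs(ph - th) / (max(ph, th) + eps)
    shape_cost = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0
```

In this sketch the loss is close to zero for identical boxes, grows as the predicted box drifts away from the real one, and exceeds 1 for non-overlapping boxes, so it still provides a position signal when the IoU term alone is zero.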