Automation of the casting sector relies heavily on object detection, a pivotal technology for pouring robots. An algorithm that can identify and locate target pouring holes in complex casting workshops is therefore crucial for advancing the intelligence of the casting process. However, the pouring workshop environment is generally challenging: lighting is uneven, pouring holes vary in size, and the target area is often heavily occluded, all of which degrade detection accuracy. To overcome these challenges, this paper proposes an enhanced YOLOv8s algorithm for object detection in pouring robots. First, to handle pouring holes at different scales, a Multi-Scale Residual Channel and Spatial Information Fusion Module (MRCS) is designed to aggregate channel and spatial information, strengthening the feature extraction capability of the model; the proposed enhancement is validated on the Pascal VOC dataset. Second, a SimAM attention mechanism is added at the end of the backbone network to focus the detection network on the positional region of the pouring hole; importantly, this addition introduces no extra parameters or computational burden. Finally, the detection head from the RT-DETR model is introduced in the detection part of the model. Combining the real-time detection capability of YOLO with the deep feature extraction capability of RT-DETR improves detection accuracy while preserving real-time performance. Experimental results on the updated pouring hole dataset show that, with only a slight increase in parameters, the proposed model improves mAP@0.5 by 2.5% and F1-Score by 3.5% over the baseline YOLOv8s. Precision (P) increases by 1.8%, recall (R) by 3.5%, and FPS reaches 110, meeting the real-time requirements of pouring robots.
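The parameter-free property of SimAM mentioned above follows from its formulation: each activation is reweighted by an energy-based measure of its distinctiveness within its channel, computed directly from the feature statistics rather than from learned weights. A minimal NumPy sketch following the published SimAM formulation is given below (the function name and the `e_lambda` regularizer value are illustrative, not taken from this paper):

```python
import numpy as np

def simam(x, e_lambda=1e-4):
    """Parameter-free SimAM attention sketch.

    x: feature map of shape (B, C, H, W).
    e_lambda: small regularizer from the SimAM energy function
              (value here is a common default, assumed).
    """
    b, c, h, w = x.shape
    n = h * w - 1  # number of other neurons in each channel plane
    # Per-channel mean and variance over spatial positions.
    mu = x.mean(axis=(2, 3), keepdims=True)
    d = (x - mu) ** 2
    v = d.sum(axis=(2, 3), keepdims=True) / n
    # Inverse of the minimal energy; more distinctive neurons
    # (larger d) get larger attention logits.
    e_inv = d / (4.0 * (v + e_lambda)) + 0.5
    # Sigmoid gating, then elementwise reweighting of the input.
    return x * (1.0 / (1.0 + np.exp(-e_inv)))
```

Because the weighting is derived entirely from the input statistics, the module adds no trainable parameters, consistent with the claim in the abstract that the attention mechanism introduces no extra parameters.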