As socio-economic demands evolve, traditional power facility inspection methods can no longer meet practical requirements because of their low efficiency and poor scalability. To address this, the study investigates integrated motion control for inspection robots equipped with gimbal mechanisms, with the aim of making dynamic data collection more convenient and efficient. A customized multi-source heterogeneous visual detection and recognition model based on the YOLOv3 framework is proposed; it uses a path aggregation network to fuse multi-scale features and thereby enhance information processing capacity. Experimental analysis shows that the error rate rises as the robot's movement speed increases, which points to a direction for further optimization. In the target recognition experiment, the proposed model achieved an average accuracy of 94.26% on visible light images and 68.05% on infrared images. In addition, the Sub_YOLO algorithm reached a detection speed of 30 frames per second with an average accuracy above 80%, an important step forward for real-time object detection applications. In the linear motion test, the relative error of the robot's motion accuracy was 0.33% at a speed of 500 millimeters per second. When the speed was raised to 1200 millimeters per second, the error grew to 2.45%, indicating a significant increase in slip. The robot's linear motion accuracy is therefore acceptable at low to medium speeds but degrades significantly at high speeds. Overall, the results confirm the synergistic effect of integrated motion control between the inspection robot and the gimbal, as well as the superiority of Sub_YOLO in target recognition. Together these improve the capability of wheeled robots for electrical inspection and represent substantial technical progress in the field of autonomous inspection.
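The relative error reported for the linear motion test can be illustrated with a short calculation. The abstract does not state the formula or the test distance, so the sketch below assumes the common definition (absolute deviation of the measured travel from the commanded travel, divided by the commanded travel) and a hypothetical 1000 mm run. The function name `relative_error_percent` and the measured distances are illustrative values chosen only to reproduce the reported percentages; they are not data from the study.

```python
def relative_error_percent(commanded_mm: float, measured_mm: float) -> float:
    """Relative error of travelled distance, as a percentage of the commanded distance.

    Assumed definition: |measured - commanded| / commanded * 100.
    """
    if commanded_mm <= 0:
        raise ValueError("commanded distance must be positive")
    return abs(measured_mm - commanded_mm) / commanded_mm * 100.0


# Hypothetical trials keyed by speed in mm/s: (commanded distance, measured distance).
# The measured distances are made up for illustration, not taken from the experiments.
trials = {
    500: (1000.0, 996.7),
    1200: (1000.0, 975.5),
}

for speed, (commanded, measured) in trials.items():
    err = relative_error_percent(commanded, measured)
    print(f"{speed} mm/s: relative error = {err:.2f}%")
```

Under these assumptions, a shortfall of 3.3 mm over 1000 mm corresponds to the 0.33% figure, while a shortfall of 24.5 mm corresponds to 2.45%, which gives a sense of how much additional wheel slip the higher speed implies.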