As smart grids continue to evolve, inspection robots based on visual-inertial odometry (VIO), which fuses vision with inertial measurement unit (IMU) data, have become increasingly critical for ensuring safe operation and maintenance within substations. Implementing VIO requires precise calibration of the intrinsic and extrinsic parameters, initialization, and synchronization of the IMU and camera sensor clocks. This paper proposes a joint camera-IMU initialization algorithm that enables online self-calibration and time synchronization of the camera and IMU, and extends the open-source VINS-Mono system to utilize depth data in the VIO phase. The proposed algorithm is evaluated through experiments on the EuRoC dataset. The results are compared with those of vision-only, VINS-Mono, and VI-ORB systems and demonstrate the potential for improved performance over these algorithms. Moreover, a mobile robot platform equipped with vision and IMU sensors is built to verify the utility and effectiveness of the algorithm in real-world scenarios.