To make toy robots more entertaining and intelligent, this work proposes an intelligent toy robot system that incorporates a voice recognition sensor and a voice control system. The overall architecture consists of a client and a server. The client's camera calibration and data transmission module collects images, calculates the internal and external parameters of the camera, and transmits the images and external parameter data to the server. From the images and external parameter data transmitted by the terminal, a background image is constructed and the camera position and angle are updated in real time, completing the fusion of virtual and real scenes. Through the motion control part of the user interaction module, hearing-impaired children can control the movement and rotation of the smart toy. Experimental results show that the system achieves high communication synchronization and stability, realizes high-precision control of the smart toy, and reaches an average frame rate of 30.97 frames/s. The system offers a variety of functions, supports speech recognition, and is engaging to use.
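
As a rough illustration of how internal and external camera parameters can place virtual content in a real image, the sketch below projects a 3D world point to pixel coordinates. It assumes a standard pinhole camera model without lens distortion, which the abstract does not specify. The function name `project_point` and the example values in `K_EXAMPLE`, `R_IDENTITY`, and `T_ZERO` are hypothetical and are not taken from the described system.

```python
# Minimal sketch, assuming a pinhole camera model without lens distortion.
# All names and numeric values here are illustrative assumptions.

def project_point(K, R, t, point_world):
    """Project a 3D world point to pixel coordinates (u, v).

    K: 3x3 internal (intrinsic) parameter matrix
    R: 3x3 rotation matrix, t: 3-vector translation (external parameters)
    """
    # Transform the point into the camera frame: X_c = R * X_w + t
    cam = [
        sum(R[i][j] * point_world[j] for j in range(3)) + t[i]
        for i in range(3)
    ]
    if cam[2] <= 0:
        raise ValueError("point is not in front of the camera")

    # Perspective division gives normalized image coordinates
    x = cam[0] / cam[2]
    y = cam[1] / cam[2]

    # Apply focal lengths, skew, and principal point from K
    u = K[0][0] * x + K[0][1] * y + K[0][2]
    v = K[1][1] * y + K[1][2]
    return u, v


# Hypothetical intrinsics: 800 px focal length, principal point (320, 240)
K_EXAMPLE = [
    [800.0, 0.0, 320.0],
    [0.0, 800.0, 240.0],
    [0.0, 0.0, 1.0],
]
# Hypothetical extrinsics: camera aligned with the world frame at the origin
R_IDENTITY = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
T_ZERO = [0.0, 0.0, 0.0]

if __name__ == "__main__":
    print(project_point(K_EXAMPLE, R_IDENTITY, T_ZERO, [0.5, 0.0, 2.0]))
```

In a setup like the one described, the server would re-evaluate such a projection whenever new external parameters arrive, so that virtual objects stay aligned with the live camera view.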