This study used the Kinect depth sensor to track two-hand gestures and control a robotic arm in real time. The control system consists mainly of a microprocessor, a color camera, the depth sensor, and the robotic arm. The Kinect depth sensor captured images of the human body, from which the skeleton was analyzed to obtain the relevant joint information. This information was used to identify the gestures of the user's left hand and left palm. The left-hand gesture served as the input command, and the right-hand gesture was used to teach the robotic arm by imitation. The depth sensor provided real-time images of the human body and the depth information of each joint, which were converted to the corresponding relative positions of the robotic arm. Using forward and inverse kinematics with Denavit-Hartenberg (D-H) link parameters, the right-hand gesture information was converted, via coordinate transformation, into the angle of each motor of the robotic arm. Left-palm detection relied on the color camera. When the left palm was not detected, the user could simply use the right hand to control the robotic arm in real time. When the left palm was detected and 5 fingertips were identified, recording of the right hand's real-time imitation movement of the robotic arm started. When no fingertips were identified, the recording stopped. When 2 fingertips were identified, the user could both control the robotic arm in real time and replay the recorded actions.
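The fingertip-count commands described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name GestureController, the Pose type, and the step method are hypothetical. It also assumes that a new recording overwrites the previous one, that replay is triggered only while not recording, and that fingertip counts other than 0, 2, and 5 leave the current state unchanged, none of which the abstract specifies.

```python
from typing import List, Optional, Tuple

# Hypothetical pose type: one angle per motor of the robotic arm.
Pose = Tuple[float, ...]


class GestureController:
    """Maps the left-hand fingertip count to record/stop/replay commands."""

    def __init__(self) -> None:
        self.recording = False
        self.recorded: List[Pose] = []

    def step(self, fingertips: Optional[int], pose: Pose) -> List[Pose]:
        """Return the poses to send to the arm for this frame.

        fingertips is None when the left palm is not detected;
        pose is the arm pose derived from the right hand.
        """
        if fingertips == 5 and not self.recording:
            # 5 fingertips: start recording (assumption: overwrite old data).
            self.recording = True
            self.recorded = []
        elif fingertips == 0:
            # 0 fingertips: stop recording.
            self.recording = False

        if fingertips == 2 and not self.recording:
            # 2 fingertips: replay the recorded sequence.
            return list(self.recorded)

        if self.recording:
            self.recorded.append(pose)

        # Default: the right hand drives the arm directly.
        return [pose]
```

Called once per camera frame, step returns the single live pose during direct control and recording, and the whole stored sequence when a replay is requested.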