A convenient and effective binocular vision system is set up. Gesture information can be accurately extract from the complex environment with the system. The template calibration method is used to calibrate the binocular camera and the parameters of the camera are accurately obtained. In the phase of stereo matching, the BM algorithm is used to quickly and accurately match the images of the left and right cameras to get the parallax of the measured gesture. Combined with triangulation principle, resulting in a more dense depth map. Finally, the depth information is remapped to the original color image to realize three-dimensional reconstruction and three-dimensional cloud image generation. According to the cloud image information, it can be judged that the binocular vision system can effectively segment the gesture from the complex background.