Gesture recognition is an important part of human-robot interaction. To achieve fast, stable, real-time gesture recognition that is not restricted to a fixed working distance, this paper presents an improved threshold segmentation method. The method uses spatial hierarchical scanning to combine the depth and color information of the target scene with the hand position, and then extracts the region of interest (ROI) with the local neighbor method. In this way, the hand can be identified quickly and accurately in complex scenes and at different distances. A convex hull detection algorithm is then applied within the ROI to identify and locate the fingertips. The experimental results show that the improved method obtains the hand position quickly and accurately against a complex background, that real-time recognition works at distances from 0.5 m to 2.0 m, and that the fingertip detection rate reaches 98.5% on average. Moreover, the gesture recognition rate with the convex hull detection algorithm exceeds 96%. It can thus be concluded that the proposed method performs well in hand detection and positioning at different distances.
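As a rough illustration of the two stages named above (depth-based segmentation followed by convex hull detection), the following minimal sketch may help. It is not the spatial hierarchical scanning or local neighbor method of this paper. It assumes, purely for illustration, that the hand is the object nearest to the sensor, that a depth of 0 means "no reading", and that a fixed depth band of 0.15 m separates the hand from the background. The function names `segment_nearest_layer` and `convex_hull` are hypothetical, and the hull is computed with the standard monotone chain algorithm.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates


def segment_nearest_layer(depth: List[List[float]], band: float = 0.15) -> List[Point]:
    """Keep pixels whose depth is within `band` metres of the nearest valid depth.

    Assumption (not from the paper): the hand is the closest object and a
    depth value of 0 marks a missing reading.
    """
    valid = [d for row in depth for d in row if d > 0]
    if not valid:
        return []
    nearest = min(valid)
    return [
        (x, y)
        for y, row in enumerate(depth)
        for x, d in enumerate(row)
        if 0 < d <= nearest + band
    ]


def _cross(o: Point, a: Point, b: Point) -> int:
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Monotone chain convex hull; collinear points are dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]
```

In a sketch like this, the hull vertices of the segmented region serve only as fingertip candidates. A practical detector would still need to filter them, for example by curvature or by distance from the palm center, and would typically operate on a contour rather than on every foreground pixel.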