Robot technology shows broad application prospects in rehabilitation medicine, especially in hand rehabilitation. Hand function is indispensable at every level of daily life, from basic life skills to occupational needs. It underlies self-care activities such as eating, dressing, and grooming, and the ability to use the hands freely is directly related to an individual's quality of life and independence. This study proposes a gesture recognition algorithm that fuses the YCbCr color space with a convolutional neural network (CNN). The method first converts gesture images to the YCbCr color space and then recognizes gestures from the converted images. On this basis, a hand function rehabilitation training robot based on YCbCr and CNN is designed to provide rehabilitation treatment for patients with impaired hand function. Experiments showed that when the data set size was 500, the signal-to-noise ratios of YOLOv3, YOLOv3-SPP, YOLOv4, and the proposed hybrid algorithm were 27.5 dB, 32.7 dB, 34.8 dB, and 41.2 dB, respectively, and their intersection over union values were 0.53, 0.64, 0.77, and 0.89, respectively. These results indicate that the proposed hybrid algorithm outperformed the compared algorithms on both metrics.
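For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal sketch of an RGB-to-YCbCr conversion and of the intersection over union between two bounding boxes. It is not the authors' implementation. The conversion uses the full-range BT.601 (JPEG/JFIF) coefficients, which is an assumption because the abstract does not state which YCbCr variant was used. The function names, the `(x1, y1, x2, y2)` box layout, and the Cb/Cr skin thresholds (a commonly cited heuristic) are likewise illustrative assumptions.

```python
import numpy as np


def rgb_to_ycbcr(rgb):
    """Convert an (..., 3) RGB array in [0, 255] to full-range BT.601 YCbCr.

    Assumption: JPEG/JFIF coefficients; the source does not name the variant.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)


def skin_mask(ycbcr, cb_range=(77.0, 127.0), cr_range=(133.0, 173.0)):
    """Boolean mask of pixels whose chroma falls inside a skin-tone box.

    Assumption: the default thresholds are a common heuristic, not values
    taken from the source.
    """
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return (
        (cb >= cb_range[0]) & (cb <= cb_range[1])
        & (cr >= cr_range[0]) & (cr <= cr_range[1])
    )


def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

Separating luminance (Y) from chrominance (Cb, Cr) makes a chroma-based hand mask less sensitive to lighting changes than thresholding RGB directly, which is one plausible motivation for converting images before passing them to a CNN. An intersection over union of 1.0 means the predicted and reference boxes coincide, and 0.0 means they do not overlap.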