With the continued development of artificial intelligence, AI-based applications are proliferating, and video image recognition is among the most widely used in daily life. Starting from the image recognition pipeline, this paper discusses human gesture recognition based on artificial intelligence and the composite features of video images. We use a feature extraction algorithm to extract composite image features, conduct human action collection experiments, and analyze the database and the gesture recognition steps. The paper introduces the extraction method for composite image features and the basic requirements of gesture recognition, implements human gesture recognition from composite video image features through the feature extraction computation, and validates it with the human action collection experiments. The resulting images and data show the advantages of the algorithm used in this paper. We compare DMTI-MSHOG with other methods on the three subsets. In terms of accuracy across all tests, our method outperforms the other methods. The results show that the MSHOG (Multi-scale Histogram of Oriented Gradients) descriptor can represent the distinctive characteristics of human behavior, which reflects the effectiveness of the proposed method. In particular, the method achieved 100% recognition accuracy in Test, with an average recognition accuracy of 94.91%, which is significantly better than existing methods.
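
To make the idea of a multi-scale histogram of oriented gradients concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes a single grayscale image as input, unsigned gradient orientations quantized into 9 bins, and three cell sizes (4, 8, and 16 pixels). The function names `hog_cells` and `multi_scale_hog`, the cell sizes, the bin count, and the per-scale L2 normalization are all illustrative assumptions, since the abstract does not specify them.

```python
import numpy as np


def hog_cells(image, cell_size, n_bins=9):
    """Per-cell orientation histograms at one scale (hypothetical helper).

    Returns a flat vector of magnitude-weighted histograms of unsigned
    gradient orientation, L2-normalised over the whole scale.
    """
    image = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(image)               # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    # Clamp guards against a rounding edge case at exactly 180 degrees.
    bin_index = np.minimum((angle / (180.0 / n_bins)).astype(int), n_bins - 1)

    rows = image.shape[0] // cell_size
    cols = image.shape[1] // cell_size
    hist = np.zeros((rows, cols, n_bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell_size, (r + 1) * cell_size),
                      slice(c * cell_size, (c + 1) * cell_size))
            hist[r, c] = np.bincount(bin_index[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=n_bins)

    flat = hist.ravel()
    norm = np.linalg.norm(flat)
    return flat / norm if norm > 0 else flat


def multi_scale_hog(image, cell_sizes=(4, 8, 16), n_bins=9):
    """Concatenate single-scale descriptors (assumed scales, for illustration)."""
    return np.concatenate([hog_cells(image, s, n_bins) for s in cell_sizes])


# Usage: a 32x32 horizontal intensity ramp has gradients only along x.
ramp = np.tile(np.arange(32.0), (32, 1))
descriptor = multi_scale_hog(ramp)
```

Concatenating histograms computed with several cell sizes lets the descriptor capture both fine local edges and coarser shape structure. A descriptor of this kind would then be passed to a classifier, which this sketch does not cover.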