Smart cultural tourism is a development trend for the tourism industry, and virtual reality is an important tool for realizing it. The realism of virtual reality comes mainly from human-computer interaction, which depends closely on human action recognition technology. This study therefore focuses on human action recognition: it uses a self-organizing map (SOM) neural network to extract key frames from action videos, combines them with a multi-feature vector method to recognize human actions, and compares the recognition rate and user satisfaction of different recognition methods. The results show that the multi-feature voting human action recognition algorithm based on the SOM neural network achieves a recognition rate of 93.68% on the UT-Kinect action dataset and 59.06% on MSRDailyActivity3D, with an overall action recognition time of only 3.59 s. Within six months, a human-computer interactive virtual reality tourism project using the SOM neural network multi-feature vector method as its core algorithm earned a total profit of 422,000 yuan, and 88% of users expressed satisfaction after use. These results indicate that the proposed method has a good recognition rate and can give users effective, timely feedback. It is hoped that this research offers some reference value for advancing human action recognition technology.
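The two ideas named above, SOM-based key frame extraction and multi-feature voting, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each video frame has already been reduced to a numeric feature vector, that a small one-dimensional SOM with a simple "bubble" neighbourhood is sufficient, and that each feature type yields one predicted label which a plain majority vote then combines. The names `train_som`, `key_frame_indices`, and `vote` are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_matching_unit(weights, x):
    """Index of the SOM node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j], x))


def train_som(frames, n_nodes, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM on per-frame feature vectors (assumed input format)."""
    rng = random.Random(seed)
    # Initialise node weights as copies of randomly chosen frames.
    weights = [list(f) for f in rng.sample(frames, n_nodes)]
    radius0 = max(n_nodes / 2.0, 1.0)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # linear decay schedule (assumption)
        lr = lr0 * decay
        radius = max(radius0 * decay, 0.5)    # shrinks until only the winner moves
        for x in frames:
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Bubble neighbourhood: update nodes within `radius` of the winner.
                if abs(j - bmu) <= radius:
                    for d in range(len(w)):
                        w[d] += lr * (x[d] - w[d])
    return weights


def key_frame_indices(frames, weights):
    """For each SOM node, select the frame closest to it as a key frame."""
    picks = {
        min(range(len(frames)), key=lambda i: sq_dist(frames[i], w))
        for w in weights
    }
    return sorted(picks)


def vote(labels):
    """Majority vote over the labels predicted from each feature type."""
    return Counter(labels).most_common(1)[0][0]
```

In use, one would call `train_som` on the frame features of a clip, pass the resulting weights to `key_frame_indices` to obtain a small set of representative frames, classify those frames once per feature type, and hand the resulting labels to `vote`. The number of nodes, the decay schedule, and the tie-breaking behaviour of the vote are all illustrative choices.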