Virtual Reality (VR) provides users with a sensory experience close to reality and a sense of interaction. VR is widely used, and gesture recognition plays an important role in it: it enriches the VR user experience and enables more direct and natural interaction. Gesture recognition usually employs sensors to collect data from users and machine learning algorithms to interpret and respond to human activities. Complex gestures require more complex algorithms and more rigorous processing, because they generate larger quantities of data. The larger the data, the harder it is to build robust and effective datasets, and the harder features become to extract, which leads to gestures being misrecognized or not recognized at all. Although machine learning algorithms are widely used in gesture recognition, important challenges still need to be addressed, such as the lack of standardization and the limited availability of large, diverse datasets. Nevertheless, VR, gesture recognition, and machine learning algorithms all have excellent prospects, because they are in line with current trends and reflect the progress of science and technology. This paper examines them comprehensively, considering both their advantages and their shortcomings.