Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation, and sign language recognition. In this paper, we apply a hybrid generative-discriminative approach that uses the Fisher Vector (FV) to improve recognition performance. The strategy is to combine the generative approach of the Hidden Markov Model (HMM), which handles spatio-temporal motion data, with the discriminative approach of the Support Vector Machine (SVM), which focuses on the classification task. The motion segments are encoded into HMMs, and each segment is converted to an FV whose elements are obtained as the derivatives of the probability of the segment being generated by the HMMs with respect to their parameters. An SVM is subsequently trained on the FVs, and an input gesture is classified into the corresponding gesture category by the SVM. In the experiments, we test our approach by comparing three HMM chain models and four categorization methods on the dataset provided by the ChaLearn Looking at People Challenge 2014 (LAP 2014). The results show that similar gesture patterns are clustered closely in several categories. Our approach based on left-to-right HMMs outperforms other gesture recognition methods. More specifically, the hybrid generative-discriminative approach outperforms the standard HMM approach, and the generative kernel approach outperforms the generative embedding approach. These results show that our approach is effective in improving recognition performance.
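As an illustration of this pipeline, the following is a minimal sketch (not the authors' implementation) of a Fisher-Vector-style encoding computed from per-class Gaussian HMMs, followed by SVM training. It assumes the hmmlearn and scikit-learn libraries, keeps only the derivatives with respect to the Gaussian means, and uses hypothetical names such as train_segments (a list of T-by-D skeleton feature arrays) and train_labels.

import numpy as np
from hmmlearn.hmm import GaussianHMM
from sklearn.svm import SVC

def fisher_vector(segment, model):
    # Derivative of log p(segment | HMM) w.r.t. the Gaussian means only,
    # a common simplification of the full Fisher score.
    gamma = model.predict_proba(segment)            # (T, K) state posteriors
    means = model.means_                            # (K, D)
    covs = np.asarray(model.covars_)
    if covs.ndim == 3:                              # covariances stored as full matrices
        covs = np.diagonal(covs, axis1=1, axis2=2)  # keep the diagonal variances
    # grad wrt mu_k:  sum_t gamma[t, k] * (x_t - mu_k) / var_k
    diff = segment[:, None, :] - means[None, :, :]  # (T, K, D)
    grad = (gamma[:, :, None] * diff / covs[None, :, :]).sum(axis=0)
    fv = grad.ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))          # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)        # L2 normalization

# One Gaussian HMM per gesture class (left-to-right topology details omitted).
hmms = {c: GaussianHMM(n_components=5, covariance_type="diag", n_iter=20)
        for c in set(train_labels)}
for c, model in hmms.items():
    segs = [s for s, y in zip(train_segments, train_labels) if y == c]
    model.fit(np.vstack(segs), [len(s) for s in segs])

# Each segment is encoded against every class HMM and the pieces concatenated.
def encode(segment):
    return np.concatenate([fisher_vector(segment, hmms[c]) for c in sorted(hmms)])

X = np.array([encode(s) for s in train_segments])
clf = SVC(kernel="linear").fit(X, train_labels)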
In this paper, we propose a motion model that focuses on the discriminative parts of the human body related to target motions in order to classify human motions into specific categories, and we apply this model to multi-class daily motion classification. We extend this model into a motion recognition system that generates multiple sentences associated with human motions. The motion model is evaluated on four datasets acquired with a Kinect sensor or with multiple infrared cameras in a motion capture studio: UCFkinect, UT-kinect, HDM05-mocap, and YNL-mocap. We also evaluate the sentences generated from a dataset of motion and language pairs. The experimental results indicate that the motion model improves classification accuracy and that our approach outperforms other state-of-the-art methods on specific datasets, including human-object interactions with variations in motion duration, such as daily human motions. We achieve a classification rate of 81.1% for multi-class daily motion classification in a non-cross-subject setting. Additionally, the sentences generated by the motion recognition system are semantically and syntactically appropriate descriptions of the target motions, which may lead to human-robot interaction using natural language.
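For intuition only, below is a minimal sketch of one way to emphasize discriminative body parts before classification: scoring each skeleton feature dimension with a one-way ANOVA F-statistic and re-weighting the features accordingly. This is a plain feature-weighting stand-in, not the motion model proposed in the paper; skeleton_features (an N-by-D array of per-joint features) and labels are hypothetical names, and scikit-learn is assumed.

import numpy as np
from sklearn.feature_selection import f_classif
from sklearn.svm import LinearSVC

# Per-dimension class-separability scores (one-way ANOVA F-statistic).
F, _ = f_classif(skeleton_features, labels)
weights = F / (F.max() + 1e-12)        # larger weight = more discriminative dimension
clf = LinearSVC().fit(skeleton_features * weights, labels)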