In this paper, we use data from the Microsoft Kinect sensor, which processes the captured image of a person and extracts joint information in every frame. We then propose building a single image from all the sequential frames of a gesture movement, which facilitates training a convolutional neural network (CNN). We trained the CNN using two strategies: combined training and individual training. Both strategies were evaluated on the MSRC-12 dataset, obtaining an accuracy of 86.67% with combined training and 90.78% with individual training. The trained network was then used to classify data captured live from a person with the Kinect, obtaining an accuracy of 72.08% with combined training and 81.25% with individual training. Finally, we use the system to send commands to a mobile robot in order to control it.
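As an illustration of how a sequence of per-frame joint positions might be turned into a single image for a CNN, the following is a minimal sketch and not necessarily the encoding used in this work. It assumes each frame provides 3D coordinates for a fixed set of joints, maps the x, y and z coordinates to the three colour channels, places joints on rows and time on columns, and resamples the time axis to a fixed width. The function name `skeleton_sequence_to_image` and the `width` parameter are hypothetical.

```python
import numpy as np


def skeleton_sequence_to_image(frames, width=64):
    """Encode a (T, J, 3) joint sequence as a (J, width, 3) uint8 image.

    Assumed layout (hypothetical, for illustration only): rows are joints,
    columns are time steps, and the channels hold the x, y, z coordinates.
    """
    frames = np.asarray(frames, dtype=np.float64)
    if frames.ndim != 3 or frames.shape[2] != 3:
        raise ValueError("expected an array of shape (T, J, 3)")

    # Min-max normalise each coordinate axis over the whole gesture.
    lo = frames.min(axis=(0, 1), keepdims=True)
    hi = frames.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    scaled = (frames - lo) / span  # values in [0, 1]

    # Resample the time axis to a fixed width by nearest-frame selection,
    # so gestures of different durations give images of the same size.
    idx = np.round(np.linspace(0, frames.shape[0] - 1, width)).astype(int)
    resampled = scaled[idx]  # shape (width, J, 3)

    # Reorder to (J, width, 3) and quantise to 8-bit pixel values.
    return (resampled.transpose(1, 0, 2) * 255).round().astype(np.uint8)
```

For example, a gesture of 30 frames with 20 tracked joints would yield an image of shape `(20, 64, 3)` that can be fed to a standard image classifier. Normalising per gesture is one possible design choice: it removes the dependence on where the person stands relative to the sensor, at the cost of discarding absolute scale.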