We organized a challenge on gesture recognition: http://gesture.chalearn.org. We made available a large database of 50,000 hand and arm gestures video-recorded with a Kinect™ camera providing both RGB and depth images. We used the Kaggle platform to automate submissions and entry evaluation. The focus of the challenge is on "one-shot learning", which means training gesture classifiers from a single video clip example of each gesture. The data are split into subtasks, each using a small vocabulary of 8 to 12 gestures related to a particular application domain: hand signals used by divers, finger codes to represent numerals, signals used by referees, marshalling signals to guide vehicles or aircraft, etc. We limited the problem to a single user per task and to the recognition of short sequences of gestures punctuated by returning the hands to a resting position. This situation is encountered in computer interface applications, including robotics, education, and gaming. The challenge setting fosters progress in transfer learning by providing, for training, a large number of subtasks related to, but different from, the tasks on which competitors are tested.
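In this one-shot setting, the simplest baseline is nearest-neighbor matching against the single training clip of each gesture. The sketch below assumes each clip has already been reduced to a fixed-length feature vector; the function name and the feature representation are illustrative assumptions, not part of the challenge protocol.

```python
import numpy as np

def one_shot_classify(query_feat, templates):
    """Nearest-neighbor one-shot classification: each gesture class
    is represented by a single training example (a feature vector).
    `templates` maps gesture_id -> feature vector extracted from the
    one training clip of that gesture (hypothetical representation)."""
    best_label, best_dist = None, np.inf
    for label, template_feat in templates.items():
        dist = np.linalg.norm(query_feat - template_feat)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```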
Abstract. The Kinect™ camera has revolutionized the field of computer vision by making available low-cost 3D cameras recording both RGB and depth data with an infrared depth sensor. We recorded and made available a large database of 50,000 hand and arm gestures. With these data, we organized a challenge emphasizing the problem of learning from very few examples. The data are split into subtasks, each using a small vocabulary of 8 to 12 gestures related to a particular application domain: hand signals used by divers, finger codes to represent numerals, signals used by referees, marshalling signals to guide vehicles or aircraft, etc. We limited the problem to a single user per task and to the recognition of short sequences of gestures punctuated by returning the hands to a resting position. This situation is encountered in computer interface applications, including robotics, education, and gaming. The challenge setting fosters progress in transfer learning by providing, for training, a large number of subtasks related to, but different from, the tasks on which competitors are tested.
Recognizing sign language is a very challenging task in computer vision. One of the more popular approaches, Dynamic Time Warping (DTW), uses hand trajectory information to compare a query sign with those in a database of examples. In this work, we conducted an American Sign Language (ASL) recognition experiment on Kinect sign data using DTW for sign trajectory similarity and Histograms of Oriented Gradients (HOG) [5] for hand shape representation. Our results show an improvement over the original work of [14], achieving 82% accuracy at ranking the correct sign within the top 10 matches. In addition to our method, which improves sign recognition accuracy, we propose a simple RGB-D alignment tool that helps roughly approximate the alignment parameters between the color (RGB) and depth frames.
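A minimal sketch of the DTW trajectory comparison follows, assuming each sign is summarized as a time series of 2D or 3D hand positions; the HOG hand-shape component and the paper's exact combination rule are omitted here.

```python
import numpy as np

def dtw_distance(traj_a, traj_b):
    """Dynamic Time Warping cost between two hand trajectories,
    each an (n, d) array of hand positions over time."""
    n, m = len(traj_a), len(traj_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(traj_a[i - 1] - traj_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # skip a frame in traj_a
                                 cost[i, j - 1],      # skip a frame in traj_b
                                 cost[i - 1, j - 1])  # align the two frames
    return cost[n, m]
```

Ranking a query against the example database then amounts to sorting signs by this distance and checking whether the correct sign appears among the top matches.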
This paper introduces principal motion components (PMC), a new method for one-shot gesture recognition. In the considered scenario, a single training video is available for each gesture to be recognized, which limits the application of traditional techniques (e.g., HMMs). In PMC, a 2D map of motion energy is obtained for each pair of consecutive frames in a video. The motion maps associated with a video are processed to obtain a PCA model, which is used for recognition under a reconstruction-error approach. The main benefits of the proposed approach are its simplicity, ease of implementation, competitive performance, and efficiency. We report experimental results in one-shot gesture recognition using the ChaLearn Gesture Dataset, a benchmark comprising more than 50,000 gestures recorded as both RGB and depth video with a Kinect™ camera. Results obtained with PMC are competitive with alternative methods proposed for the same dataset.
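The PMC pipeline described above can be sketched as follows; the motion-energy computation and parameter choices (frame preprocessing, number of principal components) are assumptions for illustration, not the paper's exact settings.

```python
import numpy as np
from sklearn.decomposition import PCA

def motion_maps(frames):
    """Motion-energy maps: absolute difference of each pair of
    consecutive grayscale frames, flattened to one vector per pair."""
    frames = np.asarray(frames, dtype=np.float32)   # (T, H, W)
    diffs = np.abs(frames[1:] - frames[:-1])        # (T-1, H, W)
    return diffs.reshape(len(diffs), -1)

def fit_pmc(train_video, n_components=5):
    """Fit a PCA model to the motion maps of the single training clip
    (requires at least n_components + 1 frames)."""
    pca = PCA(n_components=n_components)
    pca.fit(motion_maps(train_video))
    return pca

def reconstruction_error(pca, query_video):
    """Mean squared error when the query's motion maps are projected
    onto a gesture's PCA subspace and reconstructed; at test time the
    gesture whose model yields the lowest error is predicted."""
    maps = motion_maps(query_video)
    recon = pca.inverse_transform(pca.transform(maps))
    return float(np.mean((maps - recon) ** 2))
```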