The ChaLearn gesture dataset (CGD 2011)

Guyon, Isabelle; Athitsos, Vassilis; Jangyodsuk, Pat; Escalante, Hugo Jair

doi:10.1007/s00138-014-0596-3

Cited by 91 publications

(46 citation statements)

References 16 publications


“…Every gesture has one training sample. For the sake of comparison, the Mean Levenshtein Distance (MLD) score [41], which was used by the challenge organizers, is adopted to evaluate the recognition performance. The recognition accuracy increases as the MLD score decreases, and vice versa.…”
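The relationship between MLD and accuracy described in this statement can be illustrated with a minimal sketch. It assumes the common definition of the score: the sum of Levenshtein (edit) distances between predicted and true gesture-label sequences, divided by the total number of true gestures. The function names and the toy labels are hypothetical and are not taken from the cited papers or the challenge toolkit.

```python
from typing import List, Sequence


def levenshtein(a: Sequence[int], b: Sequence[int]) -> int:
    """Edit distance (insertions, deletions, substitutions) between two label sequences."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def mean_levenshtein_distance(predicted: List[Sequence[int]],
                              truth: List[Sequence[int]]) -> float:
    """Assumed MLD: total edit distance divided by total number of true gestures."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must contain the same number of videos")
    total_gestures = sum(len(t) for t in truth)
    if total_gestures == 0:
        raise ValueError("truth must contain at least one gesture")
    total_edits = sum(levenshtein(p, t) for p, t in zip(predicted, truth))
    return total_edits / total_gestures


# Toy example: one missed gesture in the first video, one spurious gesture in the second.
truth = [[1, 2, 3], [4, 5]]
predicted = [[1, 3], [4, 5, 5]]
print(mean_levenshtein_distance(predicted, truth))  # 2 edits over 5 true gestures = 0.4
```

Under this definition a perfect recognizer scores 0, and the score can exceed 1 when a system outputs many spurious gestures, which is why a lower MLD corresponds to better recognition.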

Section: Results (mentioning)

confidence: 99%

Adaptive Local Spatiotemporal Features from RGB-D Data for One-Shot Learning Gesture Recognition

Lin

Ruan

et al. 2016

Sensors


Noise and constant empirical motion constraints affect the extraction of distinctive spatiotemporal features from one or a few samples per gesture class. To tackle these problems, an adaptive local spatiotemporal feature (ALSTF) using fused RGB-D data is proposed. First, motion regions of interest (MRoIs) are adaptively extracted using grayscale and depth velocity variance information to greatly reduce the impact of noise. Then, corners are used as keypoints if their depth, and velocities of grayscale and of depth meet several adaptive local constraints in each MRoI. With further filtering of noise, an accurate and sufficient number of keypoints is obtained within the desired moving body parts (MBPs). Finally, four kinds of multiple descriptors are calculated and combined in extended gradient and motion spaces to represent the appearance and motion features of gestures. The experimental results on the ChaLearn gesture, CAD-60 and MSRDailyActivity3D datasets demonstrate that the proposed feature achieves higher performance compared with published state-of-the-art approaches under the one-shot learning setting and comparable accuracy under the leave-one-out cross validation.


“…The dataset was formed by re-annotating the ChaLearn 2011 Gesture Dataset [20] to enable evaluation of user-independent recognition. ConGD contains 47,933 gesture samples belonging to 249 gesture classes performed by 21 subjects.…”

Section: Discussion (mentioning)

confidence: 99%

Particle Filter Based Probabilistic Forced Alignment for Continuous Gesture Recognition

Camgöz

Hadfield

Bowden

2017

2017 IEEE International Conference on Computer Vision Workshops (ICCVW)


In this paper, we propose a novel particle filter based probabilistic forced alignment approach for training spatiotemporal deep neural networks using weak border level annotations. The proposed method jointly learns to localize and recognize isolated instances in continuous streams. This is done by drawing training volumes from a prior distribution of likely regions and training a discriminative 3D-CNN from this data. The classifier is then used to calculate the posterior distribution by scoring the training examples and using this as the prior for the next sampling stage. We apply the proposed approach to the challenging task of large-scale user-independent continuous gesture recognition. We evaluate the performance on the popular ChaLearn 2016 Continuous Gesture Recognition (ConGD) dataset. Our method surpasses state-of-the-art results by obtaining 0.3646 and 0.3744 Mean Jaccard Index Score on the validation and test sets of ConGD, respectively. Furthermore, we participated in the ChaLearn 2017 Continuous Gesture Recognition Challenge and were ranked 3rd. It should be noted that our method is learner independent; it can be easily combined with other approaches.
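The Mean Jaccard Index scores quoted in this abstract measure temporal overlap between predicted and ground-truth gesture segments. The sketch below shows only the core overlap ratio, under the assumption that each segment is an inclusive `(start_frame, end_frame)` pair and that predictions are already matched to ground-truth segments. The official ConGD protocol also aggregates per class and per sequence, which is not reproduced here, and the function names are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # inclusive (start_frame, end_frame); an assumed representation


def jaccard_index(pred: Segment, truth: Segment) -> float:
    """Frame-level intersection over union of two temporal segments."""
    pred_frames = set(range(pred[0], pred[1] + 1))
    truth_frames = set(range(truth[0], truth[1] + 1))
    union = pred_frames | truth_frames
    if not union:
        return 0.0
    return len(pred_frames & truth_frames) / len(union)


def mean_jaccard_index(pairs: List[Tuple[Segment, Segment]]) -> float:
    """Average overlap over already-matched (predicted, ground-truth) segment pairs."""
    if not pairs:
        raise ValueError("at least one segment pair is required")
    return sum(jaccard_index(p, t) for p, t in pairs) / len(pairs)


# Toy example: frames 6..10 overlap (5 frames) out of frames 1..15 (15 frames).
print(jaccard_index((1, 10), (6, 15)))
```

A score of 1 means the predicted segment covers exactly the same frames as the ground truth, and 0 means no overlap, so higher values indicate better localization.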


“…In the past few years, there have been some works on zero/one-shot learning. For example, Wan et al. [138] proposed novel spatial-temporal features for one-shot learning gesture recognition and obtained promising performance on the ChaLearn Gesture Dataset (CGD) [45]. For zero-shot learning, Madapana and Wachs [88] proposed a new paradigm based on adaptive learning, with which it is possible to determine the amount of transfer learning carried out by the algorithm and how much knowledge is acquired for a new gesture observation.…”

Section: Future Research Directions (mentioning)

confidence: 99%

RGB-D-based human motion recognition with deep learning: A survey

Wang

Ogunbona

et al. 2018

Computer Vision and Image Understanding


Human motion recognition is one of the most important branches of human-centered research activities. In recent years, motion recognition based on RGB-D data has attracted much attention. Along with the development in artificial intelligence, deep learning techniques have gained remarkable success in computer vision. In particular, convolutional neural networks (CNN) have achieved great success for image-based tasks, and recurrent neural networks (RNN) are renowned for sequence-based problems. Specifically, deep learning methods based on the CNN and RNN architectures have been adopted for motion recognition using RGB-D data. In this paper, a detailed overview of recent advances in RGB-D-based motion recognition is presented. The reviewed methods are broadly categorized into four groups, depending on the modality adopted for recognition: RGB-based, depth-based, skeleton-based and RGB+D-based. As a survey focused on the application of deep learning to RGB-D-based motion recognition, we explicitly discuss the advantages and limitations of existing techniques. In particular, we highlight the methods of encoding spatial-temporal-structural information inherent in video sequences, and discuss potential directions for future research.

… body parts, in contrast with the few body parts that are involved in a gesture. An activity is composed of a sequence of actions. An interaction is a type of motion performed by two actors; one actor is human while the other may be human or an object. This implies that the interaction category will include human-human or human-object interaction. "Hugging each other" and "playing guitar" are examples of these two kinds of interaction, respectively. Group activity is the most complex type of activity, and it may be a combination of gestures, actions and interactions. Necessarily, it involves more than two humans and from zero to multiple objects. Examples of group activities would include "two teams playing basketball" and "group meeting".

Early research on human motion recognition was dominated by the analysis of still images or videos [2,144,132,99,44,176]. Most of these efforts used color and texture cues in 2D images for recognition. However, the task remains challenging due to problems posed by background clutter, partial occlusion, view-point, lighting changes, execution rate and biometric variation. This challenge remains even with current deep learning approaches [49,4].

With the recent development of cost-effective RGB-D sensors, such as Microsoft Kinect™ and Asus Xtion™, RGB-D-based motion recognition has attracted much attention. This is largely because the extra dimension (depth) is insensitive to illumination changes and includes rich 3D structural information of the scene. Additionally, 3D positions of body joints can be estimated from depth maps [114]. As a consequence, several methods based on RGB-D data have been proposed and the approach has proven to be a promising direction for human motion analysis.

Several survey papers have summarized the research on human motion recognition...
