Hand Detection by Two-Level Segmentation with Double-Tracking and Gesture Recognition Using Deep-Features

Sarma, Debajit; Bhuyan, M. K.

doi:10.1007/s11220-022-00379-1

Cited by 12 publications

(6 citation statements)

References 38 publications

“…However, the integration of multiple features to enhance segmentation accuracy compromised real-time performance. In 2022, Sarma et al. [9] combined skin color segmentation and motion-based frame difference segmentation to achieve a two-stage segmentation of moving hands, effectively addressing lighting variations and self-occlusion challenges in gesture videos. Nonetheless, a primary drawback lies in its limited ability to handle complex backgrounds.…”
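The combination described in this statement (a skin-color test followed by a frame-difference test) can be illustrated with a minimal sketch. It is not the cited authors' implementation: the RGB skin rule, the threshold `motion_thresh`, and the names `is_skin` and `moving_skin_mask` are all assumptions chosen for illustration, and frames are plain nested lists of (R, G, B) tuples.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Frame = List[List[Pixel]]      # rows of pixels


def is_skin(p: Pixel) -> bool:
    """Rule-of-thumb RGB skin test (an assumed stand-in for a skin model)."""
    r, g, b = p
    return (r > 95 and g > 40 and b > 20
            and max(p) - min(p) > 15
            and abs(r - g) > 15 and r > g and r > b)


def gray(p: Pixel) -> float:
    """Luma approximation used to compare intensities between frames."""
    r, g, b = p
    return 0.299 * r + 0.587 * g + 0.114 * b


def moving_skin_mask(prev: Frame, curr: Frame,
                     motion_thresh: float = 25.0) -> List[List[bool]]:
    """Keep pixels that look like skin AND changed since the previous frame."""
    mask = []
    for row_prev, row_curr in zip(prev, curr):
        mask.append([
            is_skin(c) and abs(gray(c) - gray(p)) > motion_thresh
            for p, c in zip(row_prev, row_curr)
        ])
    return mask
```

Requiring both tests is what suppresses static skin-colored regions, but a moving skin-colored object in the background would still pass, which is consistent with the drawback the statement mentions.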

Section: Dynamic Gesture Segmentation (mentioning)

confidence: 99%

Review of vision-based gesture recognition technology

Bao, Dai, 2024

International Conference on Optics and Machine Vision (ICOMV 2024)

With the emergence of deep learning methodologies, vision-based gesture recognition technology has continuously advanced. This paper primarily delves into four main stages of vision-based gesture recognition: gesture segmentation, gesture tracking, feature extraction, and gesture classification. It sequentially introduces pertinent techniques from representative literature spanning from 2018 to 2023. Based on this analysis, the current status of vision-based gesture recognition technology is examined, paving the way for predicting its future trends and developments.

“…This is achieved by modeling the pixel distributions with Gaussian mixture models (GMMs). In [17], the authors also propose background removal based on skin and motion segmentation to facilitate the classification model. In [35], the segmentation was performed using the depth channel of a Kinect RGB-D camera.…”
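For readers unfamiliar with the term, a Gaussian mixture model scores a value as a weighted sum of Gaussian densities. The sketch below evaluates a one-dimensional mixture; the excerpt does not say which pixel features or how many components the cited work uses, so the scalar input and the names `gaussian_pdf` and `gmm_pdf` are illustrative assumptions only.

```python
import math
from typing import Sequence


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gmm_pdf(x: float, weights: Sequence[float],
            means: Sequence[float], variances: Sequence[float]) -> float:
    """Mixture density: sum over components of weight * Gaussian density.

    The weights are assumed to be non-negative and to sum to 1.
    """
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))
```

In a segmentation setting, one such model could be fitted per class (for example foreground and background) and a pixel assigned to whichever model gives it the higher density; that usage is an assumption about the general technique, not a description of the cited paper.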

Section: Related Work (mentioning)

confidence: 99%

“…In line with the trend of learning systems, convolutional neural networks (CNNs) have been successfully applied in image recognition tasks, while recurrent neural networks (RNNs) are a natural choice in recognizing gestures from videos [14, 15]. One solution for HGR is to include a hand segmentation module as the first stage of the pipeline [16, 17]. The process of separating the hand from the background allows the recognition model to focus on the relevant information of the input image while reducing the impact of variations in the background or lighting conditions.…”

Section: Introduction (mentioning)

confidence: 99%

Domain Adaptation with Contrastive Simultaneous Multi-Loss Training for Hand Gesture Recognition

Baptista, Santos, Silva, et al. 2023

Sensors

Hand gesture recognition from images is a critical task with various real-world applications, particularly in the field of human–robot interaction. Industrial environments, where non-verbal communication is preferred, are significant areas of application for gesture recognition. However, these environments are often unstructured and noisy, with complex and dynamic backgrounds, making accurate hand segmentation a challenging task. Currently, most solutions employ heavy preprocessing to segment the hand, followed by the application of deep learning models to classify the gestures. To address this challenge and develop a more robust and generalizable classification model, we propose a new form of domain adaptation using multi-loss training and contrastive learning. Our approach is particularly relevant in industrial collaborative scenarios, where hand segmentation is difficult and context-dependent. In this paper, we present an innovative solution that further challenges the existing approach by testing the model on an entirely unrelated dataset with different users. We use a dataset for training and validation and demonstrate that contrastive learning techniques in simultaneous multi-loss functions provide superior performance in hand gesture recognition compared to conventional approaches in similar conditions.

“…For a complex and changing background environment, segmentation may be very difficult due to the variation in shape and appearance of the body or body part, depending on many factors such as clothing, illumination variation, image resolution, etc. In [11,12], the authors used the skin segmentation method to segment the hand portion from the background. But this method had issues when skin-color-like objects were present in the background.…”

Section: Introduction (mentioning)

confidence: 99%

Two-stream fusion model using 3D-CNN and 2D-CNN via video-frames and optical flow motion templates for hand gesture recognition

Sarma, Kavyasree, Bhuyan, 2022

Innovations Syst Softw Eng

Self Cite

Hand gestures are useful tools for many applications in the human-computer interaction community. Here, the objective is to track the movement of the hand irrespective of its shape, size, and color. To this end, a motion template guided by optical flow (OFMT) is proposed. OFMT is a compact representation of the motion information of a gesture encoded into a single image. Recently, deep networks have shown impressive improvements compared to conventional hand-crafted feature-based techniques. Moreover, the use of different streams with informative input data helps to increase recognition performance. This work proposes a two-stream fusion model for hand gesture recognition. The two-stream network consists of two layers: a 3D convolutional neural network (C3D) that takes gesture videos as input and a 2D-CNN that takes OFMT images as input. C3D has shown its efficiency in capturing spatiotemporal information of a video, whereas OFMT helps to eliminate irrelevant gestures by providing additional motion information. Though each stream can work independently, they are combined with a fusion scheme to boost the recognition results. We have shown the efficiency of the proposed two-stream network on two databases.
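The abstract does not specify the fusion scheme. As one common possibility, the sketch below performs score-level (late) fusion by taking a weighted average of the per-class probabilities from the two streams. The weight `video_weight` and the names `softmax`, `fuse_scores`, and `predict` are hypothetical, and the inputs stand in for the raw class scores of the video stream and the motion-template stream.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(video_logits: Sequence[float],
                template_logits: Sequence[float],
                video_weight: float = 0.5) -> List[float]:
    """Weighted average of the two streams' class probabilities (assumed scheme)."""
    if len(video_logits) != len(template_logits):
        raise ValueError("both streams must score the same set of classes")
    pv = softmax(video_logits)
    pt = softmax(template_logits)
    return [video_weight * a + (1.0 - video_weight) * b for a, b in zip(pv, pt)]


def predict(fused: Sequence[float]) -> int:
    """Index of the class with the highest fused probability."""
    return max(range(len(fused)), key=lambda i: fused[i])
```

With equal weights, a confident prediction from one stream can override a weaker, conflicting prediction from the other, which is the general motivation for combining streams that can also work independently.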
