American Sign Language Words Recognition of Skeletal Videos Using Processed Video Driven Multi-Stacked Deep LSTM

Abdullahi, Sunusi Bala; Chamnongthai, Kosin

doi:10.3390/s22041406

Cited by 35 publications

(41 citation statements)

References 71 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Although this method is highly accurate and takes less time to process, and also involves few background problems, this model can only be used with one hand; however, sign language sometimes involves the use of both hands. The second group is the double-hand group [12]. A double-hand-based method was introduced using Leap Motion sensors to solve the problem of similar shapes but different movements and rotations.…”

Section: Related Workmentioning

confidence: 99%

“…These methods used the 3D model feature extraction method, resulting in high accuracy. More importantly, the backhand approach has been applied [2,3,6,11,12], which is essential for the practical application of the system in daily life. However, these features are hard to use in the case of SRM signs because of the problems with their similar shape, rotation, and movement.…”

Section: Related Workmentioning

confidence: 99%

“…Consequently, the features used in the existing work may not be sufficient to distinguish these types of sign words. In the previous research on the backhand view [2,12], a similar problem was solved using rotation-based analysis, such as pitch, roll, and yaw angles. Moreover, another previous work [3,6,11] used positionbased feature extraction based on a backhand view.…”

Section: Problem Analysismentioning

confidence: 99%

“…In addition, these features are used with a forehand view, which leads to misclassification; therefore, it is difficult to apply this method with a backhand view. The authors in [2,3,6,11,12] extracted hand features using a backhand view approach, which led to outstanding performance; however, these studies may fail to recognize words with a similar shape, rotation, and movement (SRM words). To address this problem with the existing works of [2,3,6,11,12], in this paper we propose the spatial-temporal body parts and hand relationship patterns (ST-BHR patterns) as the main feature of analysis for 72 isolated signed words in the SRM group [13] based on the backhand approach.…”

Section: Introductionmentioning

confidence: 99%

“…The authors in [2,3,6,11,12] extracted hand features using a backhand view approach, which led to outstanding performance; however, these studies may fail to recognize words with a similar shape, rotation, and movement (SRM words). To address this problem with the existing works of [2,3,6,11,12], in this paper we propose the spatial-temporal body parts and hand relationship patterns (ST-BHR patterns) as the main feature of analysis for 72 isolated signed words in the SRM group [13] based on the backhand approach. In this method, both single and double hands were applied, and their information was obtained using the 3D distance-based Cartesian product obtained from a 3D depth camera.…”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

Backhand-Approach-Based American Sign Language Words Recognition Using Spatial-Temporal Body Parts and Hand Relationship Patterns

Chophuk

Chamnongthai

Chinnasarn

2022

Sensors

Self Cite

View full text Add to dashboard Cite

Most of the existing methods focus mainly on the extraction of shape-based, rotation-based, and motion-based features, usually neglecting the relationship between hands and body parts, which can provide significant information to address the problem of similar sign words based on the backhand approach. Therefore, this paper proposes four feature-based models. The spatial–temporal body parts and hand relationship patterns are the main feature. The second model consists of the spatial–temporal finger joint angle patterns. The third model consists of the spatial–temporal 3D hand motion trajectory patterns. The fourth model consists of the spatial–temporal double-hand relationship patterns. Then, a two-layer bidirectional long short-term memory method is used to deal with time-independent data as a classifier. The performance of the method was evaluated and compared with the existing works using 26 ASL letters, with an accuracy and F1-score of 97.34% and 97.36%, respectively. The method was further evaluated using 40 double-hand ASL words and achieved an accuracy and F1-score of 98.52% and 98.54%, respectively. The results demonstrated that the proposed method outperformed the existing works under consideration. However, in the analysis of 72 new ASL words, including single- and double-hand words from 10 participants, the accuracy and F1-score were approximately 96.99% and 97.00%, respectively.

show abstract

Section: Related Workmentioning

confidence: 99%

Section: Related Workmentioning

confidence: 99%

Section: Problem Analysismentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Backhand-Approach-Based American Sign Language Words Recognition Using Spatial-Temporal Body Parts and Hand Relationship Patterns

Chophuk

Chamnongthai

Chinnasarn

2022

Sensors

Self Cite

View full text Add to dashboard Cite

show abstract

Spatial–temporal attention with graph and general neural network-based sign language recognition

Miah,

Hasan,

Okuyama

et al. 2024

Pattern Anal Applic

View full text Add to dashboard Cite

Automatic Sign Language Recognition (SLR) stands as a vital aspect within the realms of Human-Computer Interaction (HCI) and computer vision, facilitating the conversion of hand signs-utilized by individuals with significant hearing and speech impairments-into equivalent text or voice. Researchers have recently used hand skeleton joint information instead of the image pixel due to light illumination and complex background-bound problems. However, besides the hand information, body motion and facial gestures play an essential role in expressing sign language emotion. Also, a few researchers have been working to develop an SLR system by taking a multi-gesture dataset, but their performance accuracy and time complexity are not sufficient. In light of these limitations, we introduce a Spatial and Temporal Attention Model amalgamated with a General Neural Network designed for the SLR system. The main idea of our architecture is first to construct a fully connected graph to project the skeleton information. We employ self-attention mechanisms to extract insights from node Article Title and edge features across spatial and temporal domains. Our architecture bifurcates into three branches: a graph-based spatial branch, a graph-based temporal branch, and a general neural network branch, which collectively synergize to contribute to the final feature integration. Specifically, the spatial branch discerns spatial dependencies, while the temporal branch amplifies temporal dependencies embedded within the sequential hand skeleton data. Further, the general neural network branch enhances the architecture's generalization capabilities, bolstering its robustness. In our evaluation, utilizing the Mexican Sign Language(MSL), Pakistani Sign Langauge(PSL) datasets and American Sign Langauge Large Video Dataset(ASSLVD) -which comprises 3D joint coordinates for face, body, and hands that conducted experiments on individual gestures and their combinations. Impressively, our model demonstrated notable efficacy, achieving an accuracy rate of 99.96% for the MSL dataset, 92.00% for PSL, and 26.00% for the ASLLVD dataset, which includes 2745 classes. These exemplary performance metrics, coupled with the model's computationally efficient profile, underscore its preeminence compared to contemporaneous methodologies in the field.

show abstract

SILK-SVM: An Effective Machine Learning Based Key-Frame Extraction Approach for Dynamic Hand Gesture Recognition

Kaur,

Bansal

2024

Arab J Sci Eng

View full text Add to dashboard Cite

American Sign Language Words Recognition of Skeletal Videos Using Processed Video Driven Multi-Stacked Deep LSTM

Cited by 35 publications

References 71 publications

Backhand-Approach-Based American Sign Language Words Recognition Using Spatial-Temporal Body Parts and Hand Relationship Patterns

Backhand-Approach-Based American Sign Language Words Recognition Using Spatial-Temporal Body Parts and Hand Relationship Patterns

Spatial–temporal attention with graph and general neural network-based sign language recognition

SILK-SVM: An Effective Machine Learning Based Key-Frame Extraction Approach for Dynamic Hand Gesture Recognition

Contact Info

Product

Resources

About