Signer-Independent Arabic Sign Language Recognition System Using Deep Learning Model

Podder, Kanchon Kanti; Ezeddin, Maymouna; Chowdhury, Muhammad E. H.; Sumon, Md. Shaheenur Islam; Tahir, Anas M.; Ayari, Mohamed Arselene; Dutta, Proma; Khandakar, Amith; Mahbub, Zaid Bin; Kadir, Muhammad Abdul

doi:10.3390/s23167156

Cited by 16 publications

(12 citation statements)

References 50 publications

“…ANFIS has been proven pioneering by many researchers; however, it is counteracted by its significant weakness in swiftly adapting and performing intricate tasks of gesture recognition, an essential criterion in any sign language recognition system. As a result, ANFIS has achieved a high precision of 86.69%, but its performance and adaptation ability to intricate gesture recognition tasks remain limited (Podder et al, 2023). A detailed comparison of the proposed model with SOTA approaches is presented with an accuracy of 94.46% (Aldhahri et al, 2023) presented in (Table 1).…”

Section: Discussion (mentioning)

confidence: 99%

Efficient YOLO-Based Deep Learning Model for Arabic Sign Language Recognition

Al Ahmadi, Mohammad, Al Dawsari. 2024. Journal of Disability Research

Verbal communication is the dominant form of self-expression and interpersonal communication. Speech is a considerable obstacle for individuals with disabilities, including those who are deaf, hard of hearing, mute, and nonverbal. Sign language is a complex system of gestures and visual signs facilitating individual communication. With the help of artificial intelligence, the hearing and the deaf can communicate more easily. Automatic detection and recognition of sign language is a complex and challenging task in computer vision and machine learning. This paper proposes a novel technique using deep learning to recognize the Arabic Sign Language (ArSL) accurately. The proposed method relies on advanced attention mechanisms and convolutional neural network architecture integrated with a robust You Only Look Once (YOLO) object detection model that improves the detection and recognition rate of the proposed technique. In our proposed method, we integrate the self-attention block, channel attention module, spatial attention module, and cross-convolution module into feature processing for accurate detection. The recognition accuracy of our method is significantly improved, with a higher detection rate of 99%. The methodology outperformed conventional methods, achieving a precision rate of 0.9 and a mean average precision (mAP) of 0.9909 at an intersection over union (IoU) of 0.5. From IoU thresholds of 0.5 to 0.95, the mAP continuously remains high, indicating its effectiveness in accurately identifying signs at different precision levels. The results show the model’s robustness in accurately detecting and classifying complex multiple ArSL signs. The results show the robustness and efficacy of the proposed model.
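The abstract above reports mAP at an IoU of 0.5 and across IoU thresholds from 0.5 to 0.95. As a minimal sketch of what those thresholds mean, the snippet below computes the IoU of two axis-aligned boxes and checks whether a predicted box would count as a match at a given threshold. The function names, the (x_min, y_min, x_max, y_max) box layout, and the 0.05 threshold step are illustrative assumptions, not details taken from the cited paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, true_box, threshold=0.5):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return iou(pred_box, true_box) >= threshold


# Assumed threshold grid: 0.50, 0.55, ..., 0.95 (a common convention for
# averaging mAP over IoU levels; the paper may use a different step).
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

pred = (2, 0, 12, 10)
truth = (0, 0, 10, 10)
matches = {t: is_match(pred, truth, t) for t in thresholds}
```

With these two example boxes the overlap is 80 square units and the union is 120, so the prediction matches at the looser thresholds and fails at the stricter ones. This is why a model's mAP typically drops as the IoU threshold rises.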

“…The transformer is trained from augmented MediaPipe poses + 33 landmarks and returns an accuracy of 68.2% from user independence mode. Podder et al [53] proposed features from the face-hand region-based segmentation and SelfMLP-infused MobileNetV2-LSTM-SelfMLP. Overall accuracy of 88.57%.…”

Section: B Skeletal Representation-based Arabic Sign Language Recogni... (mentioning)

confidence: 99%

“…We loop through each C in the SFC. We
[53] | Light model | Limited data set | MobileNetV2-LSTM-SelfMLP (q = 3) | 88.57
Alyami et al [52] | Sequential learning | Low accuracy | Transformer-based poses + landmarks | SD 99.7 and SI 68.2
Balaha et al [34] | Dynamic ArSLR | Limited data set | 20 ArSL words and hybrid CNN-RNN | 98
AlSulaiman et al [54] | Effective image modeling | Limited data set | 3D-GCN vertices and edges | 97.25
Hany et al [35] | Novel ArSL characters | Letters only | Augmented Q-CNN-based features | 99.54 at 42 min
Aldhahri et al [36] | Light…”

Section: B Skeletal Feature Thresholding (mentioning)

confidence: 99%

Enhanced Weak Spatial Modeling Through CNN-Based Deep Sign Language Skeletal Feature Transformation

Alamri, Bala Abdullahi, Khan et al. 2024. IEEE Access

Recent sign language skeletal-based feature models (SLSm) consist of various distracting coordinates that lead to complex deep-learning modeling. However, SLSm is not purely a spatial-temporal coordinate arrangement problem; it is also limited by human dynamics and feature aggregations. The objectives of this work are twofold: (a) to transform the skeletal features of the SLSm model to address the problem of variations in viewpoint and changes across features of repeated signs due to human dynamics, and (b) to exploit the potential of exhaustive searching in dropping distracting features to prevent complex deep learning modeling. Method: We propose a transformed skeletal feature-based model (SCT) from a feature thresholding theory. We first extract the hand-skeletal joint-related features relevant to the coordinates and positions of the hand transcription that efficiently capture human dynamics. The extracted features are transformed into a subset of a predefined threshold and fed into the proposed ensemble exhaustive feature searching. The searched features are transformed into their equivalent deep input image sequences. Outcomes: By leveraging the skeletal-based transformed and deep spatial features, the proposed method demonstrates robust performance in sign language recognition, surpassing recent deep learning models in accuracy and simplicity. The proposed skeletal features demonstrate superiority in learning complex hand gestures of public data sets, improving accuracy by more than 2%.

Index terms: Human-computer interaction, End-to-end deep neural network, Multimodal data interaction, Hand gestures, Sign language recognition, Pattern recognition.

“…In the field of image processing, there has been an increase in popularity and pervasive adoption of deep learning and machine learning techniques [7,8]. Deep learning and computer vision can also be employed to help advance this goal, ensuring ease of use [9,10]. A Recurrent Neural Network is a form of neural network that incorporates loops for internal data storage [11].…”

Section: Introduction (mentioning)

confidence: 99%

Sign Language Word Detection Using LRCN

Sumon, Ali, Bari et al. 2024. IOP Conf. Ser.: Mater. Sci. Eng.

Self Cite

Sign language is the most effective communication for deaf or hard-of-hearing people. Specialized training is required to understand sign language, and as such, people without disabilities around them cannot communicate effectively. The main objective of this study is to develop a mechanism for streamlining the deep learning model for sign language recognition by utilizing the 30 most prevalent words in our everyday lives. The dataset was designed through 30 ASL (American Sign Language) words consisting of custom-processed video sequences, which consist of 5 subjects and 50 sample videos for each class. The CNN model can be applied to video frames to extract spatial properties. Using CNN’s acquired data, the LSTM model may then predict the action being performed in the video. We present and evaluate the results of two separate datasets—the Pose dataset and the Raw video dataset. The dataset was trained with the Long-term Recurrent Convolutional Network (LRCN) approach. Finally, a test accuracy of 92.66% was reached for the raw dataset, while 93.66% for the pose dataset.
