2024
DOI: 10.1145/3584984

Isolated Arabic Sign Language Recognition Using a Transformer-based Model and Landmark Keypoints

Abstract: Pose-based approaches to sign language recognition provide lightweight, fast models that can be adopted in real-time applications. This paper presents a framework for isolated Arabic sign language (ArSL) recognition using hand and face keypoints. We employed the MediaPipe pose estimator to extract the keypoints of sign gestures from the video stream. Using the extracted keypoints, three models were proposed for sign language recognition: Long Short-Term Memory (LSTM), Temporal Convolutional Networks (TCN), an…
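As a rough illustration of the keypoint-extraction step the abstract describes, the sketch below uses MediaPipe Holistic to collect per-frame hand and face landmarks from a sign video. The paper's exact choices (landmark subsets, normalization, frame sampling) are not given here, so everything in this sketch — the function name `extract_keypoints`, the zero-padding for missed detections, the 21/21/468 landmark counts — should be read as assumptions, not as the authors' pipeline.

```python
import cv2  # pip install opencv-python
import mediapipe as mp  # pip install mediapipe
import numpy as np


def _flatten(landmark_list, n_points):
    """Flatten a MediaPipe landmark list to (n_points * 3,); zeros if missing."""
    if landmark_list is None:
        return np.zeros(n_points * 3, dtype=np.float32)  # assumed padding scheme
    return np.array(
        [[p.x, p.y, p.z] for p in landmark_list.landmark], dtype=np.float32
    ).ravel()


def extract_keypoints(video_path):
    """Per-frame hand and face keypoints for one sign video.

    Returns an array of shape (num_frames, 2 * 21 * 3 + 468 * 3), suitable as
    input to a sequence model such as an LSTM, TCN, or Transformer.
    """
    features = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.holistic.Holistic(static_image_mode=False) as holistic:
        while True:
            ok, frame_bgr = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV decodes frames as BGR.
            results = holistic.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
            # 21 landmarks per hand and 468 face landmarks are MediaPipe defaults.
            features.append(np.concatenate([
                _flatten(results.left_hand_landmarks, 21),
                _flatten(results.right_hand_landmarks, 21),
                _flatten(results.face_landmarks, 468),
            ]))
    cap.release()
    return np.stack(features)
```

The resulting (num_frames, feature_dim) array is the kind of compact representation that makes pose-based models lightweight compared with raw-video approaches.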

Cited by 28 publications (24 citation statements) | References 50 publications
“…Another study [35] used a larger dataset of 100 classes and pose-based transformers for Arabic Sign Language recognition. The authors also used signer-independent mode for evaluation, as we did.…”
Section: Results
confidence: 99%
“…However, they only achieved an accuracy of 68.2%, while we achieved 87.69%. Additionally, the KArSL-100 dataset used in [35] was recorded from only three subjects, while our dataset had data from four subjects. This suggests that our proposed model is more robust to variations in signers’ hand movements.…”
Section: Results
confidence: 99%
“…They also developed an encoder-decoder model for recognizing sign language sentences and achieved a word error rate (WER) of 0.50 on average. The authors [26] introduced a pose-based Transformer model tailored to the KArSL-100 dataset, which consists of 100 classes of sign videos, and attained a 68.2% accuracy rate in signer-independent mode.…”
Section: Related Work
confidence: 99%
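The word error rate quoted in the statement above is the standard edit-distance metric for sentence-level recognition: the minimum number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the reference length. A minimal sketch follows; the function name `wer` and the whitespace tokenization are illustrative assumptions, not details from the cited papers.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# Example: one deleted word out of four gives WER 0.25.
assert wer("my name is ali", "my name ali") == 0.25
```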