2023
DOI: 10.1038/s41598-023-43852-x

Sign language recognition using the fusion of image and hand landmarks through multi-headed convolutional neural network

Refat Khan Pathan,
Munmun Biswas,
Suraiya Yasmin
et al.

Abstract: Sign Language Recognition is a breakthrough for communication in the deaf-mute community and has been a critical research topic for years. Although some previous studies have successfully recognized sign language, they required many costly instruments, including sensors, devices, and high-end processing power. However, such drawbacks can be easily overcome by employing artificial intelligence-based techniques. Since, in this modern era of advanced mobile technology, using a camera to take video or images is m…

Cited by 26 publications (6 citation statements)
References 26 publications
“…The results in Table 4 show that our model performs well on various evaluation metrics. Relative to the methods of Refat Khan Pathan et al. 36 and Shih-Hung Yang et al. 37, our model yields a better average test accuracy of 99.52%, which indicates that our proposed model is effective. Compared with these methods, the proposed model is lightweight.…”
Section: Methods
confidence: 63%
“…), the proposed method in this study is very simple. The parameter sizes of the models proposed by Pathan et al. 36 and Yang et al. 37 are 1.88 M and 21.24 M, respectively, whereas the parameter size of the DPCNN is only 0.06 M. Therefore, the proposed model can be used on small terminals with limited resources.…”
Section: Methods
confidence: 99%
“…Gangrade and Bharti 30 realized vision-based Indian sign language gesture recognition using a CNN. Pathan et al. 31 used a multi-headed CNN to fuse images and hand landmarks for sign language recognition. Zhang et al. 32 used deep learning and transfer learning to realize online electromyography gesture recognition.…”
Section: Literature Review
confidence: 99%
“…Similar to gesture images, the data of these 21 key points can be used as features for training gesture recognition models [16]. Refat Khan Pathan et al. [17] obtained 96.29% accuracy for the image features and 98.42% accuracy for the hand-landmark features by testing the hand images and the 21 hand key points separately on the "ASL Finger Spelling" [14] dataset. They then fused the two in a multi-headed convolutional network and obtained a test accuracy of 98.98%, which was better than either modality alone.…”
Section: Introduction
confidence: 99%
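The fusion described above — one head over the hand image, one over the 21 hand key points, concatenated before classification — can be sketched as a minimal two-headed network. This is an illustrative PyTorch sketch, not the authors' exact architecture: the layer sizes, 64×64 grayscale input, and (x, y, z) landmark format are assumptions.

```python
import torch
import torch.nn as nn

class TwoHeadSignNet(nn.Module):
    """Hypothetical two-headed fusion network: image features and
    hand-landmark features are learned separately, then concatenated
    for classification (illustrative sizes, not the paper's exact model)."""

    def __init__(self, num_classes: int = 24):
        super().__init__()
        # Image head: small CNN over a 64x64 grayscale hand crop.
        self.image_head = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
        )
        # Landmark head: MLP over the flattened 21 x 3 key-point vector.
        self.landmark_head = nn.Sequential(
            nn.Linear(21 * 3, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
        )
        # Fusion: concatenate both feature vectors, then classify.
        self.classifier = nn.Linear(128 + 64, num_classes)

    def forward(self, image: torch.Tensor, landmarks: torch.Tensor) -> torch.Tensor:
        fused = torch.cat(
            [self.image_head(image), self.landmark_head(landmarks)], dim=1
        )
        return self.classifier(fused)

model = TwoHeadSignNet()
# A batch of 8 hand crops and their 21 (x, y, z) landmarks.
logits = model(torch.randn(8, 1, 64, 64), torch.randn(8, 21 * 3))
print(logits.shape)  # torch.Size([8, 24])
```

Training each head separately first (as the paper reports doing before fusion) would amount to attaching a temporary classifier to each branch; the concatenation step is what lets the fused model exploit both modalities at once.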