Real-time attention-based embedded LSTM for dynamic sign language recognition on edge devices

Sharma, Vaidehi; Sharma, Abhishek; Saini, Sandeep

doi:10.1007/s11554-024-01435-7

J Real-Time Image Proc

2024



Cited by 2 publications

References 24 publications


Learning signs with NAO: humanoid robot as a tool for helping to learn Colombian Sign Language

Mora-Zarate,

Garzón-Castro,

Castellanos Rivillas

2024

Front. Robot. AI


Sign languages are one of the main rehabilitation methods for dealing with hearing loss. As with any other language, geographical location influences how signs are made. In Colombia in particular, the hard-of-hearing population lacks education in Colombian Sign Language, mainly because of the reduced number of interpreters in the educational sector. To help mitigate this problem, Machine Learning combined with data gloves or Computer Vision technologies has emerged as a component of sign translation systems and educational tools; however, in Colombia such solutions are scarce. On the other hand, humanoid robots such as the NAO have shown significant results when used to support a learning process. This paper proposes a performance evaluation for the design of an activity to support the learning of all 11 color-based signs from Colombian Sign Language. The activity consists of an evaluation method with two modes activated through user interaction: the first mode lets the user choose the color sign to be evaluated, and the second selects the color sign at random. To achieve this, the MediaPipe tool was used to extract torso and hand coordinates, which were the input to a Neural Network. The performance of the Neural Network was evaluated running continuously in two scenarios: first, video capture from the computer's webcam, which showed an overall F1 score of 91.6% and a prediction time of 85.2 ms; second, wireless video streaming with the NAO H25 V6 camera, which had an F1 score of 93.8% and a prediction time of 2.29 s. In addition, we took advantage of the joint redundancy of the NAO H25 V6: with its 25 degrees of freedom we were able to use gestures that created nonverbal human-robot interactions, which may be useful in future work where we want to implement this activity with a deaf community.
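The abstract above describes feeding torso and hand coordinates into a neural network. As a rough illustration only, the sketch below shows one way such coordinates could be flattened into a single input vector. The function name `to_feature_vector`, the use of 2D points, and the re-centering on the first torso point are all assumptions made for this example; the abstract does not say how the authors prepared their features.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # hypothetical (x, y) landmark in image coordinates


def to_feature_vector(torso: List[Point], hand: List[Point]) -> List[float]:
    """Flatten torso and hand points into one list of floats.

    Assumption (not from the paper): every point is re-centered on the first
    torso point so the vector does not depend on where the person stands
    in the frame.
    """
    if not torso:
        raise ValueError("at least one torso point is required")
    origin_x, origin_y = torso[0]
    features: List[float] = []
    for x, y in torso + hand:
        features.extend((x - origin_x, y - origin_y))
    return features


# Two torso points and one hand point give a vector of length 6.
vector = to_feature_vector([(1.0, 2.0), (2.0, 4.0)], [(1.5, 2.5)])
```

A fixed number of landmarks per frame keeps the vector length constant, which is what a simple feed-forward classifier would typically expect.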


ML-Based Edge Node for Monitoring Peoples’ Frailty Status

Nocera,

Senigagliesi,

Ciattaglia

et al. 2024

Sensors


The development of contactless methods to assess the degree of personal hygiene in elderly people is crucial for detecting frailty and providing early intervention to prevent complete loss of autonomy, cognitive impairment, and hospitalisation. The unobtrusive nature of the technology is essential for maintaining a good quality of life. The use of cameras and edge computing with sensors provides a way of monitoring subjects without interrupting their normal routines, and has the advantages of local data processing and improved privacy. This work describes the development of an intelligent system that takes the RGB frames of a video as input to classify the occurrence of brushing teeth, washing hands, and fixing hair. A "no action" activity is also considered. The RGB frames are first processed by two MediaPipe algorithms to extract body keypoints related to the pose and hands, which represent the features to be classified. The optimal feature extractor results from the most complex MediaPipe pose estimator combined with the most complex hand keypoint regressor, which achieves the best performance even when operating at one frame per second. The final classifier is a Light Gradient Boosting Machine classifier that achieves a weighted F1-score of more than 94% at one frame per second with observation times of seven seconds or more. When the observation window is enlarged to ten seconds, the F1-scores for the individual classes range between 94.66% and 96.35%.
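The abstract above reports a weighted F1-score. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition: the per-class F1 averaged with weights equal to each class's share of the true labels. The function name `weighted_f1` and the toy labels are illustrative assumptions; the paper's own evaluation code is not shown in the source.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    support = Counter(y_true)  # number of true samples per class
    total = 0.0
    for label, n_true in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n_true  # weight each class by its support
    return total / len(y_true)


# Toy example with two hypothetical classes.
score = weighted_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

In the toy example, class "a" has F1 = 2/3 and class "b" has F1 = 4/5; with equal support the weighted score is their mean, 11/15.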
