Automatic translation of sign language with multi-stream 3D CNN and generation of artificial depth maps

Castro, Giulia Zanon de; Guerra, Rúbia Reis; Guimarães, Frederico Gadelha

doi:10.1016/j.eswa.2022.119394

Cited by 39 publications

(7 citation statements)

References 30 publications


“…ISLR shares a lot of features with action recognition, and consequently there are several works using CNNs for feature extraction and classification [32, 33, 34, 35]. Recent work has also relied on employing 3D-CNNs [36, 37] to capture spatiotemporal information in an ensemble way. In [38, 39, 40], an Inflated 3D ConvNet (I3D) architecture [22] is proposed, whose application produces significant improvements in ISLR performance.…”

Section: Related Work (mentioning)

confidence: 99%

Synthetic Corpus Generation for Deep Learning-Based Translation of Spanish Sign Language

Perea-Trigo, Botella-López, Martínez-del-Amor et al. 2024. Sensors


Sign language serves as the primary mode of communication for the deaf community. With technological advancements, it is crucial to develop systems capable of enhancing communication between deaf and hearing individuals. This paper reviews recent state-of-the-art methods in sign language recognition, translation, and production. Additionally, we introduce a rule-based system, called ruLSE, for generating synthetic datasets in Spanish Sign Language. To check the usefulness of these datasets, we conduct experiments with two state-of-the-art models based on Transformers, MarianMT and Transformer-STMC. In general, we observe that the former achieves better results (+3.7 points in the BLEU-4 metric) although the latter is up to four times faster. Furthermore, the use of pre-trained word embeddings in Spanish enhances results. The rule-based system demonstrates superior performance and efficiency compared to Transformer models in Sign Language Production tasks. Lastly, we contribute to the state of the art by releasing the generated synthetic dataset in Spanish named synLSE.


“…Although the test accuracy of the proposed method is high, the number of classes is quite low compared to the number of words used in general sign language dictionaries. Castro et al [50] introduced a multi-stream approach involving processing summarized RGB frames, segmented regions of the hands and face, joint distances, and artificially generated depth data through a 3D-CNN. In this method, it was shown that the addition of artificial depth maps increased the generalization capacity for different signers.…”

Section: Related Literature (mentioning)

confidence: 99%

Enhancing Signer-Independent Recognition of Isolated Sign Language through Advanced Deep Learning Techniques and Feature Fusion

Akdag, Baykan 2024. Electronics


Sign Language Recognition (SLR) systems are crucial bridges facilitating communication between deaf or hard-of-hearing individuals and the hearing world. Existing SLR technologies, while advancing, often grapple with challenges such as accurately capturing the dynamic and complex nature of sign language, which includes both manual and non-manual elements like facial expressions and body movements. These systems sometimes fall short in environments with different backgrounds or lighting conditions, hindering their practical applicability and robustness. This study introduces an innovative approach to isolated sign language word recognition using a novel deep learning model that combines the strengths of both residual three-dimensional (R3D) and temporally separated (R(2+1)D) convolutional blocks. The R3(2+1)D-SLR network model demonstrates a superior ability to capture the intricate spatial and temporal features crucial for accurate sign recognition. Our system combines data from the signer’s body, hands, and face, extracted using the R3(2+1)D-SLR model, and employs a Support Vector Machine (SVM) for classification. It demonstrates remarkable improvements in accuracy and robustness across various backgrounds by utilizing pose data over RGB data. With this pose-based approach, our proposed system achieved 94.52% and 98.53% test accuracy in signer-independent evaluations on the BosphorusSign22k-general and LSA64 datasets.


“…[31][32][33][34][35]. This is considered a more challenging task as there is no predefined dataset available for regional languages and all the time authors must collect their own dataset for very few postures [36,37]. The good thing about sensor-based prototypes is that they are each worn and carried in public.…”

Section: Literature Review (mentioning)

confidence: 99%

Assistive Data Glove for Isolated Static Postures Recognition in American Sign Language Using Neural Network

et al. 2023


Sign language recognition is one of the most challenging tasks of today’s era. Most of the researchers working in this domain have focused on different types of implementations for sign recognition. These implementations require the development of smart prototypes for capturing and classifying sign gestures. Keeping in mind the aspects of prototype design, sensor-based, vision-based, and hybrid approach-based prototypes have been designed. The authors in this paper have designed sensor-based assistive gloves to capture signs for the alphabet and digits. These signs are a small but important fraction of the ASL dictionary since they play an essential role in fingerspelling, which is a universal signed linguistic strategy for expressing personal names, technical terms, gaps in the lexicon, and emphasis. A scaled conjugate gradient-based back propagation algorithm is used to train a fully-connected neural network on a self-collected dataset of isolated static postures of digits, alphabetic, and alphanumeric characters. The authors also analyzed the impact of activation functions on the performance of neural networks. Successful implementation of the recognition network produced promising results for this small dataset of static gestures of digits, alphabetic, and alphanumeric characters.

