2023
DOI: 10.3390/app13053029

Korean Sign Language Recognition Using Transformer-Based Deep Neural Network

Abstract: Sign language recognition (SLR) is one of the crucial applications of the hand gesture recognition and computer vision research domain. Many researchers have been working to develop hand gesture-based SLR applications for English, Turkish, Arabic, and other sign languages. However, few studies have been conducted on Korean sign language (KSL) classification because few KSL datasets are publicly available. In addition, the existing Korean sign language recognition work still faces challenges in being co…

Citations: cited by 44 publications (41 citation statements)
References: 60 publications
“…In the traditional VIT architecture, input signals are divided into non-overlapping patches and mapped to the input dimension through linear projection. However, modeling the structural information within local patches solely through linear projection can be less effective [38], as it overlooks local relationships and internal structural information within the patches [39]. In the field of video representation learning, one approach is to aggregate pixel-level features for each frame by adding one-dimensional convolution (1D-CNN) to capture temporal clues between the same spatial positions [40].…”
Section: D-cnn Embedding Layer (mentioning)
confidence: 99%
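The two ideas in the statement above can be illustrated with a minimal pure-Python sketch: splitting an input into non-overlapping patches that are mapped by a linear projection, and a 1-D convolution along the time axis applied independently at each spatial position. This is not the cited papers' implementation; the function names, weights, and toy shapes are hypothetical and chosen only for illustration.

```python
def split_patches(signal, patch_size):
    """Split a 1-D sequence into non-overlapping patches (any remainder is dropped)."""
    n_patches = len(signal) // patch_size
    return [signal[i * patch_size:(i + 1) * patch_size] for i in range(n_patches)]


def linear_project(patch, weights):
    """Map one patch of length P to D dimensions.

    `weights` is a list of D rows, each of length P (a hypothetical projection matrix).
    """
    return [sum(w * x for w, x in zip(row, patch)) for row in weights]


def temporal_conv1d(frames, kernel):
    """Valid 1-D convolution (cross-correlation form) along the time axis.

    `frames` is a list of T frames, each a list of S per-position feature values.
    The same kernel slides over time separately for every spatial position,
    so each output value mixes the same position across neighbouring frames.
    """
    n_frames, k_size = len(frames), len(kernel)
    n_positions = len(frames[0])
    return [
        [sum(kernel[k] * frames[t + k][s] for k in range(k_size))
         for s in range(n_positions)]
        for t in range(n_frames - k_size + 1)
    ]


if __name__ == "__main__":
    # Patch embedding: each patch is projected on its own, with no view of its neighbours.
    patches = split_patches([1, 2, 3, 4, 5, 6, 7], patch_size=2)
    embedded = [linear_project(p, [[1, 0], [0, 1], [1, 1]]) for p in patches]
    print(patches, embedded)

    # Temporal convolution: a difference kernel picks up change between frames.
    print(temporal_conv1d([[1, 10], [2, 20], [3, 30]], kernel=[1, -1]))
```

In this sketch the projection treats every patch in isolation, while the temporal kernel combines values from adjacent frames at a fixed position, which is the kind of temporal cue the statement attributes to the 1D-CNN.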
“…The Korean SL (KSL) dataset is useful for real-life scenarios. SL data are selected from 20 persons around 17 locations [34]. Also, the dataset gathered the signers' facial expressions along with their hand gesticulation.…”
Section: Dataset Description (mentioning)
confidence: 99%
“…In this subsection, the proposed method's effectiveness is compared against various state-of-the-art approaches, such as Meta-ELM [12], SVM-CNN [17], LSTMRNN-KNN [18], T-CNN [19], TDDRMN [25], and PSO-CNN [27]. The results of the comparative study are detailed below.…”
Section: Comparative Analysis (mentioning)
confidence: 99%
“…Consequently, there has been a rising interest in leveraging machine learning methods, notably deep learning techniques, to enhance the precision and efficiency of skin cancer diagnosis. Over the past few years, deep learning methods [26], particularly convolutional neural networks (CNNs) [17, 23, 26, 27, 28], have demonstrated significant potential for accurately identifying and categorizing skin cancer from medical images of skin lesions [7, 16, 24, 25]. In this review of relevant literature, we explore recent studies focusing on the detection and classification of skin cancer through the application of deep learning and transfer learning methodologies.…”
Section: Literature Review (mentioning)
confidence: 99%