Hand Sign Recognition for Thai Finger Spelling: an Application of Convolution Neural Network

Nakjai, Pisit; Katanyukul, Tatpong

doi:10.1007/s11265-018-1375-6

Cited by 42 publications

(46 citation statements)

References 32 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…ASL is the foundation of Thai finger-spelling sign language (TFSL).The TFSL was invented in 1953 by Khunying Kamala Krairuek using American finger-spelling as a prototype to represent the 42 Thai consonants, 32 vowels, and 6 intonation marks [2]. All forty-two Thai letters can be presented with a combination of twenty-five hand gestures.…”

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Multi-Stroke Thai Finger-Spelling Sign Language Recognition System with Deep Learning

Pariwat

Seresangtakul

2021

Symmetry

View full text Add to dashboard Cite

Sign language is a type of language for the hearing impaired that people in the general public commonly do not understand. A sign language recognition system, therefore, represents an intermediary between the two sides. As a communication tool, a multi-stroke Thai finger-spelling sign language (TFSL) recognition system featuring deep learning was developed in this study. This research uses a vision-based technique on a complex background with semantic segmentation performed with dilated convolution for hand segmentation, hand strokes separated using optical flow, and learning feature and classification done with convolution neural network (CNN). We then compared the five CNN structures that define the formats. The first format was used to set the number of filters to 64 and the size of the filter to 3 × 3 with 7 layers; the second format used 128 filters, each filter 3 × 3 in size with 7 layers; the third format used the number of filters in ascending order with 7 layers, all of which had an equal 3 × 3 filter size; the fourth format determined the number of filters in ascending order and the size of the filter based on a small size with 7 layers; the final format was a structure based on AlexNet. As a result, the average accuracy was 88.83%, 87.97%, 89.91%, 90.43%, and 92.03%, respectively. We implemented the CNN structure based on AlexNet to create models for multi-stroke TFSL recognition systems. The experiment was performed using an isolated video of 42 Thai alphabets, which are divided into three categories consisting of one stroke, two strokes, and three strokes. The results presented an 88.00% average accuracy for one stroke, 85.42% for two strokes, and 75.00% for three strokes.

show abstract

Section: Introductionmentioning

confidence: 99%

“…Deep learning is a tool that is increasingly being used in sign language recognition [2,9,10], face recognition [11], object recognition [12], and others. This technology is used for solving complex problems such as object detection [13], image segmentation [14], and image recognition [12].…”

Section: Introductionmentioning

confidence: 99%

Multi-Stroke Thai Finger-Spelling Sign Language Recognition System with Deep Learning

Pariwat

Seresangtakul

2021

Symmetry

View full text Add to dashboard Cite

show abstract

“…A variety of methods for the analysis of sensor data [1]- [4] and the extraction of meaningful patterns from these data have been proposed in recent decades [5]. Data collected by various sensors such as image, voice, electromyography (EMG) and chemical sensors are used for different applications such as image recognition [6]- [8], speech recognition [9], [10], gesture recognition [11]- [14] and gas classification [15]- [20]. The performance of classification techniques using sensor data varies greatly depending not only on the amount of data collected but also on the quality of the data.…”

Section: Introductionmentioning

confidence: 99%

Data Restoration by Linear Estimation of the Principal Components From Lossy Data

Lee

Choi

2020

IEEE Access

View full text Add to dashboard Cite

In this paper, we propose a method based on principal component analysis (PCA) to restore data after the occurrence of data loss due to sensor defects or environmental factors. In the L2-PCA feature space, the feature vector, which consists of principal components of the data, converges to a point known as the "convergence point" as the extent of data loss increases. Using these characteristics, we approximately linearly estimated the principal components of the original data from the feature vectors of the lossy data. The estimated principal components are used as coefficients in the linear combination of the projection vectors of the PCA feature space for data restoration. The restoration performance of the proposed method is not only superior; the method is also computationally more efficient than other data restoration methods. Experimental results for gas measurement data and facial image data confirm the excellent data restoration performance of the proposed method.

show abstract

“…This task presents a challenging problem that has not yet been solved in computer vision and machine learning. Unlike many previous studies [1,2,3,4,5,6,7], which separately have tried to address hand detection or hand gesture recognition, our approach attempts to jointly solve the problem of hand localization and gesture recognition. This task, however, is very challenging, due to the significant variations of hand images in realistic scenarios.…”

Section: Introductionmentioning

confidence: 99%

“…This problem, however, represents a high level of complexity, and retrieving the hand shape is difficult, due to the vast number of hand configurations and variations of the viewpoint with respect to the image sensor. Furthermore, recognizing static hand gestures plays an important role in many applications, such as sign language recognition for deaf and speech-impaired people [6,7], driver hand monitoring and hand gesture commands in order to reduce driver distraction [4,18], an alternative input method for interfacing between human and machines [2,8], in-air writing interaction [19], hand-object interaction in augmented and visual reality environments [20], and many other applications.…”

Section: Introductionmentioning

confidence: 99%

A Deep Learning-Based End-to-End Composite System for Hand Detection and Gesture Recognition

Mohammed

Islam

2019

Sensors

View full text Add to dashboard Cite

Recent research on hand detection and gesture recognition has attracted increasing interest due to its broad range of potential applications, such as human-computer interaction, sign language recognition, hand action analysis, driver hand behavior monitoring, and virtual reality. In recent years, several approaches have been proposed with the aim of developing a robust algorithm which functions in complex and cluttered environments. Although several researchers have addressed this challenging problem, a robust system is still elusive. Therefore, we propose a deep learning-based architecture to jointly detect and classify hand gestures. In the proposed architecture, the whole image is passed through a one-stage dense object detector to extract hand regions, which, in turn, pass through a lightweight convolutional neural network (CNN) for hand gesture recognition. To evaluate our approach, we conducted extensive experiments on four publicly available datasets for hand detection, including the Oxford, 5-signers, EgoHands, and Indian classical dance (ICD) datasets, along with two hand gesture datasets with different gesture vocabularies for hand gesture recognition, namely, the LaRED and TinyHands datasets. Here, experimental results demonstrate that the proposed architecture is efficient and robust. In addition, it outperforms other approaches in both the hand detection and gesture classification tasks.

show abstract

Hand Sign Recognition for Thai Finger Spelling: an Application of Convolution Neural Network

Cited by 42 publications

References 32 publications

Multi-Stroke Thai Finger-Spelling Sign Language Recognition System with Deep Learning

Multi-Stroke Thai Finger-Spelling Sign Language Recognition System with Deep Learning

Data Restoration by Linear Estimation of the Principal Components From Lossy Data

A Deep Learning-Based End-to-End Composite System for Hand Detection and Gesture Recognition

Contact Info

Product

Resources

About