2020 IEEE Winter Conference on Applications of Computer Vision (WACV) 2020
DOI: 10.1109/wacv45572.2020.9093512

Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison

Abstract: Vision-based sign language recognition aims at helping hearing-impaired people communicate with others. However, most existing sign language datasets are limited to a small number of words. Due to the limited vocabulary size, models learned from those datasets cannot be applied in practice. In this paper, we introduce a new large-scale Word-Level American Sign Language (WLASL) video dataset, containing more than 2000 words performed by over 100 signers. This dataset will be made publicly available to the…

Cited by 360 publications (285 citation statements)
References 69 publications
“…In parallel to the success of the deep learning based models in other domains, many works in the SLR domain recently conduct research using deep neural networks. In these approaches, instead of hand-crafted feature extraction, Convolutional Neural Networks (CNNs) are utilized effectively [1], [3], [4], [10], [15], [34]-[37]. While some of these studies do not require any segmentation methods [1], [3], [4], [35], some studies prefer to use neural networks, such as Fast R-CNN and Faster R-CNN, in order to locate the hand region [15], [34], [36].…”
Section: Related Work (mentioning)
confidence: 99%
“…It provides only 483 RGB samples in total. An extended list of sign language datasets can be found in [3], [42]. Montalbano Italian gesture dataset [43], which has recently become one of the most widely used isolated SLR datasets, contains 20 gestures and approximately 14,000 samples in total.…”
Section: A Sign Language Datasets (mentioning)
confidence: 99%