2015
DOI: 10.1007/978-3-319-16634-6_18
Curve Matching from the View of Manifold for Sign Language Recognition

Abstract: Sign language recognition is a challenging task due to complex action variations and a large vocabulary set. Sign language generally conveys meaning through multiple channels simultaneously, such as trajectory, hand posture, and facial expression. Trajectories of sign words therefore play an important role in sign language recognition. Although multichannel features are helpful for sign representation, this paper focuses only on the trajectory aspect. A method of curve matching based o…

Cited by 12 publications (6 citation statements); references 21 publications.
“…Oliveira et al. [18] used data gloves worn on both hands to capture sign language movements and fed them into neural networks, achieving recognition of English words. Lin et al. [19] used cameras to capture data from people wearing colored gloves and applied preprocessing such as color segmentation to these image data. Although sensor-based sign language recognition has made significant progress, these devices require sign language performers to comply with specific wearing requirements, making the whole process cumbersome.…”
Section: Related Work
confidence: 99%
“…TH refers to a virtual or digital representation of the human face or head, typically used in multimedia applications, computer…

Dataset          | Year | Size   | Modalities                       | Language | Availability
[78]             | 2012 | ∼3K    | Video-Text                       | English  | Link
SIGNUM [79]      | 2013 | ∼33K   | Video-Text                       | German   | Link
DEVISIGN [80]    | 2014 | ∼24K   | Video-Text                       | Chinese  | Link
ASL-LEX 1.0 [81] | 2017 | ∼1K    | Video-Text                       | English  | Link
PHOENIX14T [82]  | 2018 | ∼68K   | Video-Text                       | German   | Link
CMLR [83]        | 2019 | ∼102K  | Image-Text                       | Chinese  | Link
KETI [84]        | 2019 | ∼15K   | Video-Text                       | Korean   | Not available
GSL [85]         | 2020 | ∼3K    | Video-Text                       | Greek    | Link
ASL-LEX 2.0 [86] | 2021 | ∼10K   | Video-Text-Depth                 | English  | Link
How2sign [87]    | 2021 | ∼35K   | Video-Text-Skeleton(2D)-Depth    | English  | Link
Slovo [88]       | 2023 | ∼20K   | Video-Text                       | Russian  | Link
AASL [89]        | 2023 | ∼8K    | Image-Text                       | Arabic   | Link
ASL-27C [83]     | 2023 | ∼23K   | Image-Text                       | English  | Link

Cued Speech:
FCS [90]         | 2018 | ∼13K   | Video-Text-Audio                 | French   | Link
BEC [59]         | 2019 | ∼3K    | Video-Text-Audio                 | English  | Link
PCSC [91]        | 2020 | 20 (P) | Video-Text-Audio                 | Polish   | Link
CLeLfPC [92]     | 2022 | 350    | Video-Text-Audio                 | French   | Link
MCCS-2023 [41]   | 2023 | ∼132K  | Video-Text-Audio-Skeleton(2D,3D) | Chinese  | Link…”
Section: Talking Head
confidence: 99%
“…Since the three-dimensional convolutional neural network downsamples the image only by a small factor in the spatial domain, and the convolutional long short-term memory recurrent network does not change the spatial size of the feature map, the final long-term spatiotemporal feature map retains a relatively large spatial size (28×28 in our network, because the input size of the 3D convolutional neural network is 112×112). Spatial pyramid pooling [20] is inserted between the convolutional long short-term memory recurrent network and the fully connected layer to reduce this size, so the final fully connected layer has fewer parameters.…”
Section: Architecture Of Our Feature Extraction Network
confidence: 99%
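The size reduction described in that statement can be sketched with a minimal spatial pyramid pooling routine. This is a NumPy sketch under stated assumptions, not the cited authors' implementation: the 256-channel 28×28 feature map and the pyramid levels (1, 2, 4) are chosen here for illustration. The idea is to max-pool the feature map over progressively finer grids and concatenate the results, so the output length depends only on the channel count and the levels, not on the spatial size.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over an n x n grid for each
    pyramid level and concatenate the results, giving a fixed-length
    vector of size C * sum(n*n for n in levels) regardless of H, W."""
    C, H, W = fmap.shape
    out = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                # Bin boundaries: floor for the start, ceiling for the end,
                # so every cell is covered even when H or W is not divisible.
                h0, h1 = (i * H) // n, -(-((i + 1) * H) // n)
                w0, w1 = (j * W) // n, -(-((j + 1) * W) // n)
                out.append(fmap[:, h0:h1, w0:w1].max(axis=(1, 2)))
    return np.concatenate(out)

# A hypothetical 256-channel 28x28 map collapses to 256 * (1 + 4 + 16) values.
feat = np.random.rand(256, 28, 28)
vec = spatial_pyramid_pool(feat)
print(vec.shape)  # (5376,)
```

Because the output length is fixed at `C * (1 + 4 + 16)`, the following fully connected layer needs far fewer weights than one fed the raw `C * 28 * 28` map, which is the parameter saving the statement describes.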