2022
DOI: 10.1109/access.2022.3204677

Lip Reading in Cantonese

Abstract: Lip reading aims to recognize text from a talking face without audio information. Owing to the rapid development of deep learning techniques, researchers have made major breakthroughs in both word-level and sentence-level English lip reading in recent years. Unlike in English, it is difficult to distinguish lexical meanings in Chinese, because Chinese is a tonal language. In addition, most existing Chinese lip reading datasets are designed for Mandarin; few exist for Cantonese. In this paper, we…

Citations: cited by 5 publications (4 citation statements)
References: 20 publications
“…Comprising 50 frequently used Tibetan words, the dataset initially featured 20 native speakers, equally divided by gender, and was later augmented using data enhancement techniques to produce 36,000 videos, averaging 720 videos per sample. Recently, Teng [8] proposed a word-level Cantonese lip-reading dataset called CLRW, which contains 800 word classes with 400,000 samples. Dai et al. [29] introduced a new Cantonese in-car audio-visual speech recognition (CI-AVSR) dataset for in-car command recognition in Cantonese.…”
Section: Lip-reading Datasets (mentioning)
confidence: 99%
“…The CLRS, CI-AVSR [29], and CLRW [8] datasets are all significant resources in the field of lipreading. The CLRS dataset is a multimodal corpus tailored for Cantonese sentence-level lipreading, comprising over 30,000 natural Cantonese sentences and recordings from more than 1,000 speakers.…”
Section: Comparison With Other Cantonese Datasets (mentioning)
confidence: 99%
“…Several approaches related to lip reading are briefly addressed in this paper. The authors [3] presented a method of detecting lips and using the cropped images as the training set for convolutional neural networks. They also discussed different evaluation methods that can be used.…”
Section: Related Work (mentioning)
confidence: 99%