2021
DOI: 10.1145/3494987
SpeeChin

Abstract: This paper presents SpeeChin, a smart necklace that can recognize 54 English and 44 Chinese silent speech commands. A customized infrared (IR) imaging system mounted on the necklace captures images of the neck and face from under the chin. These images are first pre-processed and then fed to an end-to-end deep convolutional recurrent neural network (CRNN) model to infer the silent speech commands. A user study with 20 participants (10 per language) showed that SpeeChin could…
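The abstract's pipeline (per-frame IR images encoded by a CNN, aggregated over time by a recurrent network, then classified into commands) can be sketched as below. This is a minimal illustrative sketch, not the authors' actual architecture: the layer sizes, the GRU choice, the 64×64 input resolution, and the 54-way output are assumptions for demonstration.

```python
# Hedged sketch of an end-to-end convolutional-recurrent network (CRNN) that
# classifies a sequence of single-channel infrared chin images into one of
# num_commands silent speech commands. All hyperparameters are illustrative.
import torch
import torch.nn as nn


class SpeechCRNN(nn.Module):
    def __init__(self, num_commands: int = 54, hidden: int = 128):
        super().__init__()
        # Per-frame CNN encoder: 1-channel IR image -> 512-dim feature vector.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),  # 32 * 4 * 4 = 512
        )
        # GRU aggregates the per-frame features across the image sequence.
        self.rnn = nn.GRU(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_commands)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, 1, H, W) -- a clip of IR frames per sample.
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)  # (b, t, 512)
        _, h = self.rnn(feats)        # h: (num_layers=1, batch, hidden)
        return self.head(h[-1])       # (batch, num_commands) logits


model = SpeechCRNN()
logits = model(torch.randn(2, 10, 1, 64, 64))  # 2 clips, 10 frames each
print(logits.shape)  # torch.Size([2, 54])
```

Flattening the batch and time axes before the CNN lets one encoder process every frame in parallel; only the final GRU hidden state feeds the classifier, matching a whole-clip command-recognition setting rather than frame-by-frame transcription.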

Cited by 42 publications (5 citation statements) · References 51 publications
“…Lipreading is a technology that utilizes a camera to visually capture movement around the mouth and interpret speech from the image sequence. HCI researchers have proposed to use devices such as smartphones [45,58] and wearable cameras [6,34,69] to provide mobile silent speech interaction, as well as multimodal approaches such as using silent speech to facilitate eye-gaze-based selection [57].…”
Section: Silent Speech Interface
confidence: 99%
“…Additionally, there is a lack of a practical activating method to initiate silent speech input. Previous methods such as offline segmentation [6,34,69] or trigger buttons [45,52] are not feasible for hands-free real-time interactions, and MOD-based methods can be vulnerable to misactivations [57,58]. We propose a novel few-shot transfer learning paradigm to enable customizable silent speech commands.…”
Section: Machine Learning Approaches to Lipreading Interfaces
confidence: 99%
“…Therefore, a wide variety of indirect secondary carriers of information about voice commands have been explored. Many of these techniques achieve high accuracy at the price of being highly invasive because they rely on placing sensors (e.g., magnetic [2], surface electromyographic [3], infrared [4], electropalatographic [5], electromagnetic [6,7]) directly on the human's body to detect subtle vibrations that are correlated with the speech production. Obviously, such contact-based approaches are oftentimes inconvenient and, moreover, incompatible with large-scale deployment in our daily lives.…”
Section: Introduction
confidence: 99%