2017
DOI: 10.1109/taslp.2017.2758999

Speaker-Independent Silent Speech Recognition From Flesh-Point Articulatory Movements Using an LSTM Neural Network

Abstract: Silent speech recognition (SSR) converts non-audio information such as articulatory movements into text. SSR has the potential to enable persons with laryngectomy to communicate through natural spoken expression. Current SSR systems have largely relied on speaker-dependent recognition models. The high degree of variability in articulatory patterns across different speakers has been a barrier for developing effective speaker-independent SSR approaches. Speaker-independent SSR approaches, however, are critical f…

Cited by 84 publications (72 citation statements)
References 45 publications
“…This has the main idea of recording the soundless articulatory movement, and automatically generating speech from the movement information, while the subject is not producing any sound. For this automatic conversion task, typically electromagnetic articulography (EMA) [2,3,4,5], ultrasound tongue imaging (UTI) [6,7,8,9,10,11,12,13], permanent magnetic articulography (PMA) [14,15], surface electromyography (sEMG) [16,17,18], Non-Audible Murmur (NAM) [19] or video of the lip movements [7,20] are used.…”
Section: Introduction (mentioning)
confidence: 99%
“…A functional rehabilitation protocol and an optimal therapeutic education program need to be integrated into the multidisciplinary management to further improve the QoL of laryngectomized patients [62,63]. Among the promising functional rehabilitation methods, silent speech interfaces (SSIs), which represent a novel technological paradigm, have the potential to provide an alternative way to assist laryngectomized patients to produce speech with natural-sounding voices from the movements of their articulators, such as the tongue and lips [64]. SSIs typically include an articulatory movement recorder, a silent speech recognizer, and a text-to-speech synthesizer.…”
Section: Functional Results and Quality Of Life After Total Laryngectomy (mentioning)
confidence: 99%
“…SSIs typically include an articulatory movement recorder, a silent speech recognizer, and a text-to-speech synthesizer. This technology could offer the opportunity to significantly improve the voice-related QoL of patients undergoing TL [64].…”
Section: Functional Results and Quality Of Life After Total Laryngectomy (mentioning)
confidence: 99%
“…In the second case, silent speech recognition (SSR) is applied on the biosignal which extracts the content spoken by the person (i.e., the result is text). This step is then followed by text-to-speech (TTS) synthesis [3], [4], [11]- [13], [15], [18], [24], [26]. A drawback of the SSR+TTS approach might be that the errors made by the SSR component inevitably appear as errors in the final TTS output [30], and also that it causes a significant end-to-end delay.…”
Section: Introduction (mentioning)
confidence: 99%