2022
DOI: 10.1101/2022.08.02.502503
Preprint

Direct Speech Reconstruction from Sensorimotor Brain Activity with Optimized Deep Learning Models

Abstract: Development of brain-computer interface (BCI) technology is key for enabling communication in individuals who have lost the faculty of speech due to severe motor paralysis. A BCI control strategy that is gaining attention employs speech decoding from neural data. Recent studies have shown that a combination of direct neural recordings and advanced computational models can provide promising results. Understanding which decoding strategies deliver best and directly applicable results is crucial for advancing the…

Citations: cited by 6 publications (5 citation statements)
References: 71 publications
“…For the Pearson correlation, each mel-scaled spectral bin is correlated over time individually and then the average is taken. While the Pearson correlation is not a perfect measure of speech quality, it was recently shown to better correlate with intelligibility than other measures [46]. In comparison to existing works, our approach outperforms the baseline from [16] both in terms of mean-squared-error loss (Tab.…”
Section: Speech Can Be Generated From Intracranial Depth Electrodes
Citation type: mentioning
confidence: 82%
“…For the Pearson correlation, each mel-scaled spectral bin is correlated over time individually and then the average is taken. While the Pearson correlation is not a perfect measure of speech quality, it was recently shown to better correlate with intelligibility than other measures [46]. In comparison to existing works, our approach outperforms the baseline from [16] both in terms of mean-squared-error loss (Tab.…”
Section: Speech Can Be Generated From Intracranial Depth Electrodes
Citation type: mentioning
confidence: 82%
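The metric described in the statement above (correlate each mel-scaled spectral bin over time, then average across bins) can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function name `mean_binwise_pearson`, the `(time, n_mels)` array layout, the choice to skip constant bins, and the synthetic test data are all assumptions made here.

```python
import numpy as np


def mean_binwise_pearson(reference, reconstructed):
    """Average Pearson r over mel bins; inputs are (time, n_mels) arrays (assumed layout)."""
    reference = np.asarray(reference, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    if reference.shape != reconstructed.shape:
        raise ValueError("spectrograms must have the same shape")

    # Center every bin over the time axis.
    ref_c = reference - reference.mean(axis=0)
    rec_c = reconstructed - reconstructed.mean(axis=0)

    # Per-bin covariance and product of standard deviations (unnormalized).
    num = (ref_c * rec_c).sum(axis=0)
    den = np.sqrt((ref_c ** 2).sum(axis=0) * (rec_c ** 2).sum(axis=0))

    # Assumption: bins that are constant over time have no defined
    # correlation, so they are left out of the average.
    valid = den > 0
    if not valid.any():
        return float("nan")
    return float((num[valid] / den[valid]).mean())


# Synthetic stand-ins for a target and a reconstructed mel spectrogram.
rng = np.random.default_rng(0)
ref = rng.normal(size=(200, 40))
noisy = ref + 0.5 * rng.normal(size=ref.shape)

score_identical = mean_binwise_pearson(ref, ref)
score_noisy = mean_binwise_pearson(ref, noisy)
```

Averaging per-bin correlations weights every frequency band equally, regardless of its energy, which is one reason the score is only a rough proxy for perceived speech quality.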
“…It is essential to carefully train machine learning models for natural speech reconstruction in order to obtain the best results. In this regard, decoding based on reconstruction from the sensorimotor cortex could be a promising approach for developing the next generation of BCI technology for communication [31].…”
Section: F) The Neuromarketing Sector
Citation type: unclassified
“…Combining various designs proposed in previous studies [41][42][43], three separate tasks were implemented in the following order: syllable counting (60 sounds), word recognition (60 sounds), and gender recognition (24 sounds). In the first task, volunteers reported the perceived number of syllables, choosing between 1 and 5.…”
Section: Subjective Evaluation Of Synthesis Performance
Citation type: mentioning
confidence: 99%
“…In terms of model architecture, LDA classifiers combined with spectrogram inversion constituted the most practical workaround to overcome our clinical constraints and minimize model calibration time at the patient's bedside. However, as outlined in a review [54], several recent studies have proposed deep learning approaches that significantly improved the quality of artificial sounds synthesized from intracranial recordings [41,43,50,55,56]. These deep neural networks have not yet been applied both in real time and during silent speech production, but there is no doubt that they would lead to more intelligible artificial sounds.…”
Section: Feature Selection and Model Architecture Can Be Optimized
Citation type: mentioning
confidence: 99%