Objective. Decoding language representations directly from the brain can enable new brain–computer interfaces (BCIs) for high-bandwidth human–human and human–machine communication. Clinically, such technologies can restore communication in people with neurological conditions affecting their ability to speak. Approach. In this study, we propose a novel deep network architecture, Brain2Char, for decoding text (specifically character sequences) directly from direct brain recordings (electrocorticography, ECoG). The Brain2Char framework combines state-of-the-art deep learning modules: 3D Inception layers for multiband spatiotemporal feature extraction from neural data, bidirectional recurrent layers, and dilated convolution layers, followed by a language-model-weighted beam search to decode character sequences. The network is trained by optimizing a connectionist temporal classification (CTC) loss. Additionally, given the highly non-linear transformations that underlie the conversion of cortical function to character sequences, we regularize the network's latent representations, motivated by insights into the cortical encoding of speech production and by artifactual aspects specific to ECoG data acquisition. To do this, we impose auxiliary losses on latent representations for articulatory movements, speech acoustics, and session-specific non-linearities. Main results. In three (out of four) participants reported here, Brain2Char achieves word error rates of 10.6%, 8.5%, and 7.0%, respectively, on vocabulary sizes ranging from 1200 to 1900 words. Significance. These results establish a new end-to-end approach to decoding text from brain signals and demonstrate the potential of Brain2Char as a high-performance communication BCI.
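The word error rates reported above are conventionally computed as the word-level edit distance between the reference and the decoded sentence, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, a hypothesis that drops one word from a six-word reference gives a word error rate of 1/6, about 16.7%. Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.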
In this paper, we propose an end-to-end neural network (NN) based EEG-speech (NES) modeling framework, in which three network structures are developed to map imagined EEG signals to phonemes. The proposed NES models incorporate a language-model-based EEG feature extraction layer, an acoustic feature mapping layer, and a restricted Boltzmann machine (RBM) based feature learning layer. The NES models can jointly realize the representation of multichannel EEG signals and the projection of acoustic speech signals. Among the three proposed NES models, two augmented networks utilize spoken EEG signals as either bias or gate information to strengthen the feature learning and translation of imagined EEG signals. Experimental results show that all three proposed NES models outperform the baseline support vector machine (SVM) method on EEG-speech classification. With respect to binary classification, our approach achieves results comparable to those of a deep belief network approach.
In this letter, we propose a single-channel speech enhancement algorithm based on an online-estimated dictionary, which focuses on low-rank and sparse matrix decomposition. In the proposed algorithm, a noisy speech spectrogram is decomposed into low-rank background noise components and an activation of the online speech dictionary, on which both low-rank and sparsity constraints are imposed. This decomposition takes advantage of the high expressiveness of locally estimated exemplars for speech components and also accommodates nonstationary background noise. The local dictionary is obtained by estimating the speech presence probability (SPP) with the expectation-maximization (EM) algorithm, in which a generalized Gamma prior for the speech magnitude spectrum is used. The proposed algorithm is evaluated using signal-to-distortion ratio (SDR) and perceptual evaluation of speech quality (PESQ). The results show that the proposed algorithm achieves significant improvements at various SNRs when compared to four other speech enhancement algorithms: an improved Karhunen-Loeve transform (KLT) approach, SPP-based MMSE (MMSE-SPP), NMF-based RPCA (NMF-RPCA), and RPCA.
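One of the evaluation metrics named above, the signal-to-distortion ratio, can be illustrated in its simplest form as the ratio of clean-signal energy to residual-error energy, expressed in decibels. The sketch below assumes this basic energy-ratio definition; the abstract does not say which SDR variant was used (toolkits such as BSS Eval apply additional projections), and the function name `sdr_db` is hypothetical.

```python
import math
from typing import Sequence


def sdr_db(clean: Sequence[float], estimate: Sequence[float]) -> float:
    """Basic SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumption: `clean` and `estimate` are time-aligned, equal-length
    sample sequences. This is the plain energy-ratio form, not the
    BSS Eval decomposition.
    """
    if len(clean) != len(estimate):
        raise ValueError("signals must have the same length")

    signal_energy = sum(s * s for s in clean)
    error_energy = sum((s - e) ** 2 for s, e in zip(clean, estimate))

    if signal_energy == 0.0:
        raise ValueError("clean signal has zero energy; SDR is undefined")
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)
```

With this definition, an estimate whose error energy is one hundredth of the clean-signal energy scores 20 dB, and higher values indicate less distortion.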