2016
DOI: 10.48550/arxiv.1612.05369
Preprint

Neural networks based EEG-Speech Models

Abstract: In this paper, we propose an end-to-end neural network (NN) based EEG-speech (NES) modeling framework, in which three network structures are developed to map imagined EEG signals to phonemes. The proposed NES models incorporate a language-model-based EEG feature extraction layer, an acoustic feature mapping layer, and a restricted Boltzmann machine (RBM) based feature learning layer. The NES models can jointly realize the representation of multichannel EEG signals and the projection of acoustic speech sign…
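For readers skimming the abstract, the following is a minimal sketch, in PyTorch, of how the three named components could fit together: a feature extraction layer over multichannel EEG, an RBM-based feature learning layer, and a mapping to phoneme posteriors. All names, layer sizes, and the simplified RBM (deterministic up-pass only, no contrastive-divergence pretraining and no language-model conditioning) are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code) of the NES pipeline from the
# abstract: multichannel EEG -> feature extraction -> RBM-based feature
# learning -> phoneme posteriors. All sizes are assumed for the example.
import torch
import torch.nn as nn

class RBMLayer(nn.Module):
    """Bernoulli RBM used only for its deterministic up-pass p(h=1|v);
    generative CD-k pretraining is omitted for brevity."""
    def __init__(self, n_visible, n_hidden):
        super().__init__()
        self.W = nn.Parameter(torch.randn(n_visible, n_hidden) * 0.01)
        self.h_bias = nn.Parameter(torch.zeros(n_hidden))

    def forward(self, v):
        # hidden-unit activation probabilities, fed to the next layer
        return torch.sigmoid(v @ self.W + self.h_bias)

class NESModel(nn.Module):
    def __init__(self, n_channels=64, n_samples=128, n_hidden=256, n_phonemes=40):
        super().__init__()
        # (1) feature extraction over a flattened multichannel EEG window
        self.eeg_features = nn.Sequential(
            nn.Flatten(), nn.Linear(n_channels * n_samples, n_hidden), nn.ReLU())
        # (2) RBM-based feature learning layer
        self.rbm = RBMLayer(n_hidden, n_hidden)
        # (3) mapping layer producing per-window phoneme logits
        self.phoneme_head = nn.Linear(n_hidden, n_phonemes)

    def forward(self, eeg):  # eeg: (batch, n_channels, n_samples)
        h = self.rbm(self.eeg_features(eeg))
        return self.phoneme_head(h)

logits = NESModel()(torch.randn(8, 64, 128))  # -> (8, 40) phoneme logits
```

An RBM used this way would typically be pretrained generatively (e.g., with contrastive divergence) before being stacked into the discriminative pipeline; the sketch skips that step.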

Cited by 13 publications (25 citation statements, citing years 2019–2023) · References 21 publications

“…DNN-based models triggered a revolution in results on the main benchmark datasets, such as the MIT benchmark [10], where DNN-based saliency models decisively outperformed classical models. DNN-based models have already been used in several applications, such as image and video processing, medical signal processing, and big-data analysis [11], [12], [13], [14], [15]. Some of the DNN-based models have become new references, such as SALICON [16], MLNet [17], and SAM-ResNet [18].…”
Section: Visual Attention: Deep Learning Trouble (mentioning)
confidence: 99%
“…Multiple works on silent speech recognition deal with unimodal, single-phase imagined-speech EEG (is-EEG) of vowels, syllables, or words [21,24,25]. Recently, including multi-modal and multi-phasal data has proved beneficial in boosting is-EEG recognition rates [26,27]. Multi-modal information includes recording facial video and speech audio alongside EEG, while multi-phasal refers to EEG recorded during different phases of subject activity, such as speech production, imagination, or articulation. In addition to speech audio, [27] also uses bi-phasal data corresponding to spoken EEG and is-EEG simultaneously to decode is-EEG, and provides evidence of correlation between these phases by visualizing their spectrum-based scalp distributions. However, the equipment requirements for multi-modal data recordings, and the need to sync them, make deployment in real-time BCIs inconvenient.…”
Section: Motivation and Related Work (mentioning)
confidence: 99%
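To make the multi-modal/multi-phasal distinction in the statement above concrete, here is a hypothetical container for a single trial; the field names and shapes are illustrative assumptions, not a data format defined in the cited works [26,27].

```python
# Hypothetical structure for one multi-modal, multi-phasal trial:
# the same prompt captured in several modalities (EEG, audio, video)
# and across phases (imagined vs. overt speech). Shapes are assumed.
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class MultiPhasalTrial:
    label: str                        # prompted vowel / syllable / word
    eeg_imagined: np.ndarray          # (channels, samples), imagination phase
    eeg_spoken: Optional[np.ndarray]  # (channels, samples), overt-speech phase
    audio: Optional[np.ndarray]       # synced speech waveform
    video: Optional[np.ndarray]       # synced facial video frames

trial = MultiPhasalTrial(
    label="ba",
    eeg_imagined=np.zeros((64, 512)),
    eeg_spoken=np.zeros((64, 512)),
    audio=np.zeros(16000),
    video=None,  # modalities that are hard to record/sync may be absent
)
```

The optional fields reflect the deployment concern quoted above: each extra modality adds recording and synchronization overhead, so real-time BCIs tend to drop them.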
“…In [6] the authors demonstrate speech recognition using electrocorticography (ECoG) signals, which are invasive in nature, whereas our work uses non-invasive EEG signals. This work is mainly motivated by the results reported in [1], [4], [7], [8]. In [7] the authors used a classification approach to identify phonological categories in imagined and silent speech, whereas our work uses state-of-the-art continuous speech recognition models that predict words or characters at each time step. Similarly, in [8] a neural-network-based classification approach was used to predict phonemes.…”
Section: Introduction (mentioning)
confidence: 99%
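To illustrate the contrast drawn in that statement between whole-utterance classification and models that emit a prediction at every time step, here is a minimal, hypothetical sketch of per-frame character prediction trained with CTC loss, the standard objective for such continuous recognizers; the encoder, shapes, and vocabulary are assumptions, not the cited work's setup.

```python
# Minimal sketch of per-time-step character prediction with CTC: a
# recurrent encoder emits a character distribution at every frame, and
# CTC aligns those frames to a shorter transcript. Shapes are assumed.
import torch
import torch.nn as nn

vocab = ["<blank>"] + list("abcdefghijklmnopqrstuvwxyz ")
T, B, C, F = 100, 4, len(vocab), 64   # frames, batch, classes, features

encoder = nn.GRU(input_size=F, hidden_size=128, batch_first=False)
head = nn.Linear(128, C)              # per-frame character logits
ctc = nn.CTCLoss(blank=0)

feats = torch.randn(T, B, F)          # e.g. per-frame EEG features
hidden, _ = encoder(feats)
log_probs = head(hidden).log_softmax(-1)   # (T, B, C)

targets = torch.randint(1, C, (B, 12))     # dummy transcripts, no blanks
loss = ctc(log_probs, targets,
           input_lengths=torch.full((B,), T, dtype=torch.long),
           target_lengths=torch.full((B,), 12, dtype=torch.long))
loss.backward()
```

The blank symbol (index 0 here) is what lets a per-frame distribution collapse into a shorter character sequence without frame-level alignments, which is exactly what whole-utterance classifiers cannot do.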