Modeling the relationship between natural speech and a recorded electroencephalogram (EEG) helps us understand how the brain processes speech and has various applications in neuroscience and brain-computer interfaces. So far, mainly linear models have been used in this context. However, the decoding performance of linear models is limited because auditory processing in the human brain is complex and highly nonlinear. We present a novel Long Short-Term Memory (LSTM)-based architecture as a nonlinear model for classifying whether the two elements of a given (EEG, speech envelope) pair correspond to each other or not. The model maps short segments of the EEG and the envelope to a common embedding space using a Convolutional Neural Network (CNN) in the EEG path and an LSTM in the speech path. The latter also compensates for the brain response delay. In addition, we use transfer learning to fine-tune the model for each subject. The mean classification accuracy of the proposed model reaches 85%, which is significantly higher than that of a state-of-the-art CNN-based model (73%) and the linear model (69%).
Measurement of neural tracking of natural running speech from the electroencephalogram (EEG) is an increasingly popular method in auditory neuroscience and has applications in audiology. The method involves decoding the envelope of the speech signal from the EEG signal, and calculating the correlation with the envelope that was presented to the subject. Typically EEG systems with 64 or more electrodes are used. However, in practical applications, set-ups with fewer electrodes are required. Here, we determine the optimal number of electrodes, and the best position to place a limited number of electrodes on the scalp. We propose a channel selection strategy, aiming to induce the selection of symmetric EEG channel groups in order to avoid hemispheric bias. The proposed method is based on a utility metric, which allows a quick quantitative assessment of the influence of each group of EEG channels on the reconstruction error. We consider two use cases: a subject-specific case, where the optimal number and positions of the electrodes are determined for each subject individually, and a subject-independent case, where the electrodes are placed at the same positions (in the 10-20 system) for all the subjects. We evaluated our approach using 64-channel EEG data from 90 subjects. Surprisingly, in the subject-specific case we found that the correlation between actual and reconstructed envelope first increased with decreasing number of electrodes, with an optimum at around 20 electrodes, yielding 38% higher correlations using the optimal number of electrodes. In the subject-independent case, we obtained a stable decoding performance when decreasing from 64 to 32 channels. When the number of channels was further decreased, the correlation decreased. For a maximal decrease in correlation of 10%, 32 well-placed electrodes were sufficient in 87% of the subjects.
Practical electrode placement recommendations are given for 8, 16, 24 and 32 electrode systems.
... also has applications in domains such as audiology, as part of an objective measure of speech intelligibility (Vanthornhout et al., 2018; Lesenfants et al., 2019), and coma science (Braiman et al., 2018).
The relationship between the stimulus and the brain response can be studied using two different models (e.g., Crosse et al., 2016; Lalor and Foxe, 2010; Ding and Simon, 2012; Verschueren et al., 2019; Vanthornhout et al., 2018): in the forward model (also known as encoding model), we determine a linear mapping from the stimulus to the brain response. On the other hand, in the backward model (also known as stimulus reconstruction), we determine the linear mapping from the brain response to the stimulus. Backward models are referred to as decoding models, because they attempt to reverse the data generation process. Both the forward and backward models involve the solution of a linear least squares (LS) regression problem. The quality of the reconstruction is ...
Measurement of neural tracking of natural running speech from the electroencephalogram (EEG) is an increasingly popular method in auditory neuroscience and has applications in audiology. The method involves decoding the envelope of the speech signal from the EEG signal, and calculating the correlation with the envelope of the audio stream that was presented to the subject. Typically EEG systems with 64 or more electrodes are used. However, in practical applications, set-ups with fewer electrodes are required. Here, we determine the optimal number of electrodes, and the best position to place a limited number of electrodes on the scalp. We propose a channel selection strategy based on a utility metric, which allows a quick quantitative assessment of the influence of a channel (or a group of channels) on the reconstruction error. We consider two use cases: a subject-specific case, where the optimal number and positions of the electrodes are determined for each subject individually, and a subject-independent case, where the electrodes are placed at the same positions (in the 10-20 system) for all the subjects. We evaluated our approach using 64-channel EEG data from 90 subjects. In the subject-specific case we found that the correlation between actual and reconstructed envelope first increased with decreasing number of electrodes, with an optimum at around 20 electrodes, yielding 29% higher correlations using the optimal number of electrodes compared to all electrodes. This means that our strategy of removing electrodes can be used to improve the correlation metric in high-density EEG recordings. In the subject-independent case, we obtained a stable decoding performance when decreasing from 64 to 22 channels. When the number of channels was further decreased, the correlation decreased. For a maximal decrease in correlation of 10%, 32 well-placed electrodes were sufficient in 91% of the subjects.
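The decoding and channel-selection steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the lag layout, and the unregularized least-squares fit are assumptions made for illustration. The utility of a channel group is taken here to be the increase in mean squared reconstruction error when that group is removed and the decoder is re-fitted, computed with a standard least-squares identity instead of an explicit re-fit.

```python
import numpy as np


def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of each channel: (T, C) -> (T, C * n_lags).

    Columns [c * n_lags, (c + 1) * n_lags) hold channel c at lags 0..n_lags-1,
    i.e. EEG samples at or after the stimulus sample (zero-padded at the end).
    """
    T, C = eeg.shape
    X = np.zeros((T, C * n_lags))
    for c in range(C):
        for lag in range(n_lags):
            X[: T - lag, c * n_lags + lag] = eeg[lag:, c]
    return X


def fit_decoder(X, envelope):
    """Least-squares backward model: w minimising mean((X @ w - envelope)**2)."""
    T = X.shape[0]
    R = X.T @ X / T          # autocorrelation matrix of the lagged EEG
    r = X.T @ envelope / T   # cross-correlation with the envelope
    return np.linalg.solve(R, r), R


def reconstruction_correlation(X, w, envelope):
    """Pearson correlation between the reconstructed and the actual envelope."""
    return float(np.corrcoef(X @ w, envelope)[0, 1])


def group_utility(w, R, idx):
    """Increase in MSE if columns `idx` are dropped and the decoder re-fitted.

    Uses the identity  U = w_G^T (Q_GG)^{-1} w_G  with  Q = R^{-1}.
    """
    Q = np.linalg.inv(R)
    w_g = w[idx]
    return float(w_g @ np.linalg.solve(Q[np.ix_(idx, idx)], w_g))


def greedy_channel_ranking(eeg, envelope, n_lags):
    """Repeatedly drop the channel with the lowest utility.

    Returns channel indices ordered from least to most useful.
    """
    remaining = list(range(eeg.shape[1]))
    removed = []
    while len(remaining) > 1:
        X = lag_matrix(eeg[:, remaining], n_lags)
        w, R = fit_decoder(X, envelope)
        utils = [
            group_utility(w, R, np.arange(k * n_lags, (k + 1) * n_lags))
            for k in range(len(remaining))
        ]
        removed.append(remaining.pop(int(np.argmin(utils))))
    return removed + remaining
```

Calling `greedy_channel_ranking(eeg, envelope, n_lags=8)` on a `(samples, channels)` array returns the channels in removal order, so the last entries are the ones to keep. In this sketch each group is the set of lags of one channel; passing the indices of several channels to `group_utility` would score them jointly.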
We consider the estimation of the Brain Electrical Sources (BES) matrix from noisy electroencephalographic (EEG) measurements, commonly known as the EEG inverse problem. We propose a new method to induce neurophysiologically meaningful solutions, which takes into account the smoothness, structured sparsity, and low rank of the BES matrix. The method is based on the factorization of the BES matrix as a product of a sparse coding matrix and a dense latent source matrix. The structured sparse, low-rank structure is enforced by minimizing a regularized functional that includes the ℓ21-norm of the coding matrix and the squared Frobenius norm of the latent source matrix. We develop an alternating optimization algorithm to solve the resulting nonsmooth, nonconvex minimization problem. We analyze the convergence of the optimization procedure, and we compare, under different synthetic scenarios, the performance of our method with that of the Group Lasso and Trace Norm regularizers when they are applied directly to the target matrix.
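To make the two penalty terms concrete, here is a small NumPy sketch of a functional of the kind described above. Everything specific in it is an assumption, not taken from the paper: the data-fit term `0.5 * ||Y - A B C||_F^2`, the symbols `Y` (measurements), `A` (forward matrix), `B` (coding matrix), `C` (latent sources), the weights `lam` and `rho`, and the convention that the ℓ21-norm sums the Euclidean norms of the rows. The proximal operator shown is the standard row-wise soft-thresholding associated with that norm, a common building block for nonsmooth problems of this type.

```python
import numpy as np


def l21_norm(B):
    """Sum of the Euclidean norms of the rows of B (assumed row convention)."""
    return float(np.linalg.norm(B, axis=1).sum())


def prox_l21(B, tau):
    """Row-wise soft-thresholding: shrink each row norm by tau, zeroing small rows."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * B


def objective(Y, A, B, C, lam, rho):
    """Hypothetical functional: data fit + lam * ||B||_21 + (rho / 2) * ||C||_F^2."""
    resid = Y - A @ B @ C
    return 0.5 * np.sum(resid**2) + lam * l21_norm(B) + 0.5 * rho * np.sum(C**2)
```

Rows of `B` whose norm falls below `tau` are set exactly to zero by `prox_l21`, which is how a penalty of this form switches off whole rows at once.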