An investigation of the effect of mental activity on quality perception is presented, using simultaneous measurement of electroencephalography (EEG) and functional near‐infrared spectroscopy (fNIRS) in a subject‐independent approach. Building a subject‐independent model is harder than building a subject‐dependent one because of noise and high EEG variability between individuals. Correlated components analysis (CorrCA) has been proposed to extract significant correlated components for a single subject who experiences multiple identical trials; it does so by identifying spatio‐temporal patterns of activity that are well preserved across trials. The aim is to build a model based on neurophysiological data to assess text‐to‐speech quality. To this end, the authors extend CorrCA so that it can be applied in the subject‐independent setting. Two preprocessing steps are used, namely subject‐dependent and stimulus‐dependent preprocessing. The second step uses denoising source separation (DSS) to remove subject‐specific noise and artefacts. Discrete convolution is used for data fusion and a support vector machine for regression. With the proposed model, the fusion of EEG and fNIRS performs better than either modality alone. Using regression accuracy metrics defined by the authors, the model achieves an accuracy of 81.346% for overall impression, 83.28% for valence and 89.714% for arousal. The model is competitive with the subject‐dependent baseline.
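As a minimal illustration of the fusion step, the sketch below computes the full discrete convolution of two one-dimensional feature vectors. The abstract does not specify how the convolution is applied to the EEG and fNIRS features, so the function name `discrete_convolution`, the one-vector-per-modality layout and the numeric values are all assumptions made for illustration only; they are not the authors' implementation or data.

```python
def discrete_convolution(a, b):
    """Full discrete convolution: out[n] = sum over k of a[k] * b[n - k]."""
    if not a or not b:
        raise ValueError("both feature vectors must be non-empty")
    # The full convolution of lengths m and n has m + n - 1 samples.
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# Hypothetical per-trial feature vectors (illustrative values, not study data).
eeg_features = [1.0, 2.0, 3.0]
fnirs_features = [0.5, -1.0]

# Assumed fusion: one fused vector per trial, which could then be passed
# to a regressor such as a support vector machine.
fused = discrete_convolution(eeg_features, fnirs_features)
print(fused)
```

Because convolution is commutative, the order of the two modalities does not change the fused vector under this assumption, and its length grows to the sum of the two input lengths minus one.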