This study introduces a hybrid model for predicting emotional states from electroencephalogram (EEG) data, combining convolutional and transformer layers. The architecture is designed to capture both the local patterns and the long-range dependencies present in EEG signals, improving its ability to discern nuanced emotional states. We explore this fusion technique with a focus on three fundamental emotional dimensions: Arousal, Valence, and Dominance, as well as their combinations. The evaluation covers the model's performance across these emotional states, including the challenging task of simultaneous Valence-Arousal (VA) prediction, and extends to the full Valence-Arousal-Dominance (VAD) space. The model achieves the following F1 scores per classification task: Arousal 96.8, Valence 97.3, simultaneous Valence-Arousal (VA) 95.6, and simultaneous Valence-Arousal-Dominance (VAD) 94.9, demonstrating robust performance across emotional dimensions. All experiments were conducted on the DEAP dataset, yielding strong results even when multiple emotional states are recognized simultaneously.
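To make the convolution-plus-attention pipeline concrete, the following is a minimal NumPy sketch of the idea described above: a 1D convolution extracts local patterns from multichannel EEG, and a single self-attention step mixes information across time to model long-range dependencies. All shapes (32 channels, 128 samples, 8 filters), the single untrained head without learned Q/K/V projections, and the two-class readout are illustrative assumptions, not the paper's actual architecture or hyperparameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def conv1d(x, w):
    # Valid 1D convolution. x: (channels, time), w: (filters, channels, kernel).
    f, c, k = w.shape
    t = x.shape[1] - k + 1
    out = np.zeros((f, t))
    for i in range(t):
        out[:, i] = np.tensordot(w, x[:, i:i + k], axes=([1, 2], [0, 1]))
    return out

def self_attention(x):
    # Single-head scaled dot-product attention over time steps.
    # x: (time, dim); for brevity, Q = K = V = x (no learned projections).
    scores = x @ x.T / np.sqrt(x.shape[1])
    return softmax(scores, axis=-1) @ x

rng = np.random.default_rng(0)
eeg = rng.standard_normal((32, 128))       # hypothetical: 32 EEG channels, 128 samples
filters = rng.standard_normal((8, 32, 5))  # hypothetical: 8 conv filters, kernel size 5

feats = conv1d(eeg, filters)               # (8, 124): local temporal patterns
tokens = feats.T                           # (124, 8): time steps as tokens
ctx = self_attention(tokens)               # (124, 8): long-range mixing across time
logits = ctx.mean(axis=0) @ rng.standard_normal((8, 2))  # e.g. high/low valence
```

In a trained model the convolutional filters, attention projections, and classifier weights would be learned jointly; the sketch only shows how the two stages compose, with convolution feeding token-like feature vectors into attention.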