Nowadays, deep learning and convolutional neural networks (CNNs) are widespread tools in biomedical engineering studies. A CNN is an end-to-end tool that integrates the processing pipeline, but in some situations it must be fused with classical machine learning methods to achieve higher accuracy. In this paper, a hybrid approach based on deep features extracted from the weighted layers of wavelet CNNs (WCNNs) and a multiclass support vector machine (MSVM) is proposed to improve the recognition of emotional states from electroencephalogram (EEG) signals. First, EEG signals were preprocessed and converted to time-frequency (T-F) color representations, or scalograms, using the continuous wavelet transform (CWT). The scalograms were then fed into four popular pre-trained CNNs (AlexNet, ResNet-18, VGG-19, and Inception-v3) for fine-tuning, and the best feature layer of each network was used as input to the MSVM to classify the four quadrants of the valence-arousal model. Finally, the subject-independent leave-one-subject-out criterion was used to evaluate the proposed method on the DEAP and MAHNOB-HCI databases. The results show that extracting deep features from an early convolutional layer of ResNet-18 (Res2a) and classifying them with the MSVM increases average accuracy, precision, and recall by about 20% and 12% for the MAHNOB-HCI and DEAP databases, respectively. Moreover, combining scalograms from four regions (pre-frontal, frontal, parietal, and parietal-occipital) and from two regions (frontal and parietal) achieved the highest average accuracies of 77.47% and 87.45% on the MAHNOB-HCI and DEAP databases, respectively. Combining a CNN with the MSVM improved emotion recognition from EEG signals, and the results are comparable to state-of-the-art studies.
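A minimal sketch of the described pipeline (CWT scalogram → pre-trained CNN feature layer → multiclass SVM) is given below. The library choices (PyWavelets, torchvision ≥ 0.13, scikit-learn), the Morlet wavelet, the scale range, the 128 Hz sampling rate, and the use of `resnet18.layer1[0]` as a stand-in for the paper's "Res2a" layer are all assumptions made for illustration, not the authors' exact configuration; the EEG data here is random toy data.

```python
# Sketch only: scalogram -> early ResNet-18 feature map -> multiclass SVM.
# Wavelet, scales, sampling rate, and layer choice are illustrative assumptions.
import numpy as np
import pywt
import torch
from torchvision import models, transforms
from sklearn.svm import SVC

def eeg_to_scalogram(signal, fs=128, n_scales=64):
    """CWT of one EEG channel -> normalized |coefficient| map, 3-channel."""
    scales = np.arange(1, n_scales + 1)
    coeffs, _ = pywt.cwt(signal, scales, "morl", sampling_period=1.0 / fs)
    scalogram = np.abs(coeffs)
    # Normalize to [0, 1] and replicate to 3 channels for an ImageNet CNN.
    scalogram = (scalogram - scalogram.min()) / (np.ptp(scalogram) + 1e-8)
    return np.stack([scalogram] * 3, axis=0).astype(np.float32)

# Pre-trained ResNet-18; a forward hook captures the output of an early
# residual block (layer1[0], assumed here to play the role of "Res2a").
cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
captured = {}
cnn.layer1[0].register_forward_hook(
    lambda module, inp, out: captured.update(feat=out)
)

resize = transforms.Resize((224, 224), antialias=True)

def deep_features(scalogram):
    """Run one scalogram through the CNN, globally average-pool the hooked map."""
    x = resize(torch.from_numpy(scalogram)).unsqueeze(0)  # 1 x 3 x 224 x 224
    with torch.no_grad():
        cnn(x)
    fmap = captured["feat"]                          # 1 x C x H x W
    return fmap.mean(dim=(2, 3)).squeeze(0).numpy()  # C-dim feature vector

# Toy usage: 40 random one-second "EEG" trials, 4 valence-arousal classes.
rng = np.random.default_rng(0)
X = np.array([deep_features(eeg_to_scalogram(rng.standard_normal(128)))
              for _ in range(40)])
y = rng.integers(0, 4, size=40)
svm = SVC(kernel="rbf")  # SVC handles multiclass via one-vs-one internally
svm.fit(X[:30], y[:30])
print("toy accuracy:", svm.score(X[30:], y[30:]))
```

In this sketch the fine-tuning step is omitted and ImageNet weights are used as-is; in the paper's setup, each CNN would first be fine-tuned on the scalograms before its chosen layer is used as a fixed feature extractor for the MSVM.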