Emotion recognition is a highly significant and rapidly developing field of research in recent years. Its applications have left an exceptional mark in various fields, including education and research. Traditional approaches used facial expressions or voice intonation to detect emotions; however, facial gestures and spoken language can lead to biased and ambiguous results. For this reason, researchers have started to use the electroencephalogram (EEG) technique, which is a well-defined method for emotion recognition. Some approaches used standard, pre-defined methods from the signal processing area, and some worked with either fewer channels or fewer subjects to record EEG signals for their research. This paper proposes an emotion detection method based on time-frequency domain statistical features. A box-and-whisker plot is used to select the optimal features, which are later fed to an SVM classifier for training and testing on the DEAP dataset, where 32 participants of different genders and age groups are considered. The experimental results show that the proposed method achieves 92.36% accuracy on the tested dataset. In addition, the proposed method outperforms the state-of-the-art methods by exhibiting higher accuracy.
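One way to read the box-and-whisker selection step is to prefer features whose interquartile boxes for two emotion classes barely overlap. The sketch below illustrates that idea only; the abstract does not state the actual selection criterion, so the overlap score, the function names (`box_stats`, `iqr_overlap`, `rank_features`), and the toy feature values are all assumptions made for illustration.

```python
from statistics import quantiles


def box_stats(values):
    """Return (q1, median, q3), the box part of a box-and-whisker plot."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return q1, med, q3


def iqr_overlap(a, b):
    """Overlap of the two interquartile boxes, as a fraction of their joint span.

    0.0 means the boxes are disjoint; values near 1.0 mean they nearly coincide.
    This scoring rule is a hypothetical stand-in, not the paper's criterion.
    """
    a1, _, a3 = box_stats(a)
    b1, _, b3 = box_stats(b)
    overlap = max(0.0, min(a3, b3) - max(a1, b1))
    span = max(a3, b3) - min(a1, b1)
    return overlap / span if span > 0 else 1.0


def rank_features(class_a, class_b):
    """Sort feature names so the least-overlapping (most separable) come first.

    class_a, class_b: dict mapping feature name -> list of values for that class.
    """
    return sorted(class_a, key=lambda name: iqr_overlap(class_a[name], class_b[name]))


# Toy values for two hypothetical features measured on two classes.
class_a = {"f1": [1, 2, 3, 4, 5], "f2": [1, 2, 3, 4, 5]}
class_b = {"f1": [10, 11, 12, 13, 14], "f2": [2, 3, 4, 5, 6]}
ranking = rank_features(class_a, class_b)
```

Features at the front of the ranking would then be passed to the classifier; the SVM training step is omitted here because it depends on details the abstract does not give.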
Extensive research has been conducted to determine age, gender, and words spoken in Bangla speech, but no work has been conducted to identify the regional language spoken by the speaker in Bangla speech. Hence, in this study, we create a dataset containing 30 h of Bangla speech in seven regional Bangla dialects, with the goal of detecting synthesized Bangla speech and categorizing it. To categorize the regional language spoken by the speaker and determine the authenticity of the speech, we propose a model consisting of a Stacked Convolutional Autoencoder (SCAE) and a sequence of Multi-Label Extreme Learning Machines (MLELM). The SCAE creates a detailed feature map by identifying the spatially and temporally salient qualities of the MFEC input data. The feature map is then sent to the MLELM networks, which generate soft labels and then hard labels. Because aging produces physiological changes in the brain that alter the processing of aural information, the model takes age class into account while generating dialect class labels, which raises classification accuracy from 85% without age class consideration to 95% with it. The classification accuracy for synthesized Bangla speech labels is 95%. The proposed methodology also works well with English-language audio sets.
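The step from soft labels to hard labels can be pictured as picking, within each label group, the label with the highest soft score. The sketch below shows only that generic conversion; the abstract does not describe how the MLELM networks actually do it, and the group names, label names, and scores are hypothetical placeholders.

```python
def harden(soft_scores, groups):
    """Turn soft scores into one hard label per group.

    soft_scores: dict mapping label -> real-valued score (the soft labels).
    groups: dict mapping group name -> list of labels competing in that group.
    Returns a dict mapping group name -> the highest-scoring label.
    """
    return {
        group: max(labels, key=lambda label: soft_scores[label])
        for group, labels in groups.items()
    }


# Hypothetical label groups and scores, for illustration only.
groups = {
    "dialect": ["dialect_a", "dialect_b", "dialect_c"],
    "age": ["age_young", "age_old"],
    "authenticity": ["original", "synthesized"],
}
soft_scores = {
    "dialect_a": 0.10, "dialect_b": 0.70, "dialect_c": 0.20,
    "age_young": 0.35, "age_old": 0.65,
    "original": 0.90, "synthesized": 0.10,
}
hard_labels = harden(soft_scores, groups)
```

A per-label threshold is another common choice for multi-label outputs; the argmax per group is used here simply because each group in this toy setup has exactly one correct label.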