In this article, we present an automatic technique for recognizing emotional states from speech signals. The main focus of this paper is to present an efficient and reduced set of acoustic features that allows us to recognize the four basic human emotions (anger, sadness, joy, and neutral). The proposed features vector is composed by twenty-eight measurements corresponding to standard acoustic features such as formants, fundamental frequency (obtained by Praat software) as well as introducing new features based on the calculation of the energies in some specific frequency bands and their distributions (thanks to MATLAB codes). The extracted measurements are obtained from syllabic units’ consonant/vowel (CV) derived from Moroccan Arabic dialect emotional database (MADED) corpus. Thereafter, the data which has been collected is then trained by a k-nearest-neighbor (KNN) classifier to perform the automated recognition phase. The results reach 64.65% in the multi-class classification and 94.95% for classification between positive and negative emotions.
<span>Many technology systems have used voice recognition applications to transcribe a speaker’s speech into text that can be used by these systems. One of the most complex tasks in speech identification is to know, which acoustic cues will be used to classify sounds. This study presents an approach for characterizing Arabic fricative consonants in two groups (sibilant and non-sibilant). From an acoustic point of view, our approach is based on the analysis of the energy distribution, in frequency bands, in a syllable of the consonant-vowel type. From a practical point of view, our technique has been implemented, in the MATLAB software, and tested on a corpus built in our laboratory. The results obtained show that the percentage energy distribution in a speech signal is a very powerful parameter in the classification of Arabic fricatives. We obtained an accuracy of 92% for non-sibilant consonants /f, χ, ɣ, ʕ, ћ, and h/, 84% for sibilants /s, sҁ, z, Ӡ and ∫/, and 89% for the whole classification rate. In comparison to other algorithms based on neural networks and support vector machines (SVM), our classification system was able to provide a higher classification rate.</span>
<span>The speech signal is described as many acoustic properties that may contribute differently to spoken word recognition. Vowel characterization is an important process of studying the acoustic characteristics or behaviors of speech within different contexts. This current study focuses on the modulators characteristics of three Arabic vowels, we proposed a new approach to characterize the three Arabic vowels /a/, /i/ and /u/. The proposed method is based on the energy contained in the speech modulators. The coherent subband demodulation method related to the spectral center of gravity (COG) was used to calculate the energy of the speech modulators. The obtained results showed that the modulators energy help characterize the Arabic vowels /a/, /i/ and /u/ with an interesting recognition rate ranging from 86% to 100%.</span>
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.