The following paper introduces a group of novel speech-signal descriptors that reflect phoneme-pronunciation variability and that can be considered as potentially useful features for emotion sensing. The proposed group includes a set of statistical parameters of Poincare maps, derived for formant-frequency evolution and energy evolution of voiced-speech segments. Two groups of Poincare-map characteristics were considered in the research: descriptors of sample-scatter, which reflect magnitudes of phone-uttering variations and descriptors of cross-correlations that exist among samples and that evaluate consistency of variations. It has been shown that inclusion of the proposed characteristics into the pool of commonly used speech descriptors, results in a noticeable increase—at the level of 10%—in emotion sensing performance. Standard pattern recognition methodology has been adopted for evaluation of the proposed descriptors, with the assumption that three- or four-dimensional feature spaces can provide sufficient emotion sensing. Binary decision trees have been selected for data classification, as they provide with detailed information on emotion-specific discriminative power of various speech descriptors.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.