In this study, we conducted a comparative experiment on emotion perception across cultures. Emotional components were perceived by subjects from Japan, the United States, and China, none of whom had experience living abroad. An emotional speech database without linguistic information was used and evaluated with three- and/or six-emotion descriptions. Principal component analysis (PCA) indicates that common factors could explain about 60% of the variance among the three cultures when using the three-emotion description and about 50% of the variance between the Japanese and Chinese cultures when using the six-emotion description. The effects of the emotion categories on the perception results were investigated. Anger, joy, and sadness (group 1) have consistent structures in the PCA-based spaces when switching from three-emotion categories to six-emotion categories. Disgust, surprise, and fear (group 2) appeared as paired counterparts of anger, joy, and sadness, respectively. When the subspaces constructed by these two groups were investigated, the similarity between the two emotion groups was found to be fairly high in the two-dimensional space. The similarity becomes lower in three- or higher-dimensional spaces, but not significantly so. The results of this study suggest that a wide range of human emotions might fall into a small subspace of basic emotions.
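As a minimal illustration (not the authors' code), the explained-variance figures quoted above could be obtained from listener rating data with a standard PCA. The rating matrix below is a random placeholder whose shape (stimuli by culture-specific emotion scales) is only an assumption.

```python
# Sketch: cumulative variance explained by the leading common factors of a
# rating matrix (rows = stimuli, columns = per-culture emotion scales).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
ratings = rng.normal(size=(120, 9))  # placeholder data, e.g. 3 cultures x 3 emotion scales

pca = PCA()
pca.fit(ratings)

for k in (1, 2, 3):
    explained = pca.explained_variance_ratio_[:k].sum()
    print(f"first {k} components explain {explained:.0%} of the variance")
```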
This paper describes the development of an estimator of the perceptual femininity (PF) of an input utterance using speaker recognition techniques. The estimator was designed for clinical use, and the target speakers are Gender Identity Disorder (GID) clients, especially MtF (Male to Female) transsexuals. Voice therapy for MtFs is composed of three kinds of training: 1) raising the baseline F0 range, 2) changing the baseline voice quality, and 3) enhancing F0 dynamics to produce an exaggerated intonation pattern. The first two focus on static acoustic properties of speech, and voice quality is mainly controlled by the size and shape of the articulators, which can be acoustically characterized by the spectral envelope. Gaussian mixture models (GMMs) of F0 values and spectra were built separately for biologically male and female speakers. Using the four models, PF was estimated automatically for each of 142 utterances from 111 MtFs. The estimated values were compared with the PF values obtained through listening tests. The results showed a very high correlation (R = 0.86), which is comparable to the intra-rater correlation.
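A minimal sketch of the general approach described above, assuming that the four per-utterance GMM log-likelihoods are combined by linear regression against listening-test PF values; all feature arrays and variable names below are illustrative placeholders, not the paper's data or code.

```python
# Sketch: male/female GMMs of F0 and spectral features, whose likelihood
# scores for an utterance are mapped to a PF estimate by linear regression.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LinearRegression

def fit_gmm(features, n_components=8):
    return GaussianMixture(n_components=n_components, covariance_type="diag").fit(features)

rng = np.random.default_rng(0)
# Placeholder per-frame training features pooled over male/female speakers.
f0_male, f0_female = rng.normal(5.0, 0.3, (5000, 1)), rng.normal(5.5, 0.3, (5000, 1))
spec_male, spec_female = rng.normal(size=(5000, 12)), rng.normal(0.5, 1.0, (5000, 12))

gmms = {name: fit_gmm(x) for name, x in
        [("f0_m", f0_male), ("f0_f", f0_female), ("sp_m", spec_male), ("sp_f", spec_female)]}

def likelihood_scores(utt_f0, utt_spec):
    """Average log-likelihood of one utterance under each of the four GMMs."""
    return [gmms["f0_m"].score(utt_f0), gmms["f0_f"].score(utt_f0),
            gmms["sp_m"].score(utt_spec), gmms["sp_f"].score(utt_spec)]

# Combine the four scores into a PF estimate, trained on utterances that
# already have listening-test PF ratings (placeholders here).
X = np.array([likelihood_scores(rng.normal(5.2, 0.3, (200, 1)),
                                rng.normal(0.2, 1.0, (200, 12))) for _ in range(50)])
pf_ratings = rng.uniform(1, 7, 50)
pf_model = LinearRegression().fit(X, pf_ratings)
print("estimated PF:", pf_model.predict(X[:3]))
```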
This paper proposes a new method of estimating the perceptual femininity (PF) of an input utterance using Gaussian mixture model (GMM) supervectors and support vector regression (SVR). The method is used to develop a femininity estimation tool, which is introduced into the voice therapy of Gender Identity Disorder (GID) clients, especially MtF (Male to Female) transsexuals. In our previous study [1], we developed a PF estimator in which a male GMM and a female GMM of spectral features, and those of pitch features, were built, and their likelihood scores for an input utterance were combined by linear regression to estimate PF. In this work, inspired by recent speaker recognition models [2], we replace the four likelihood scores from the four GMMs with supervectors composed from a spectral GMM and a pitch GMM estimated from the input utterance. Furthermore, instead of simple linear regression, we introduce SVR, a discriminative regression method. Experiments on an MtF speech corpus show that the proposed method improves the correlation between human and machine PF scores and also reduces the squared prediction error.
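A minimal sketch, assuming the supervector-plus-SVR idea described above: the component means of a per-utterance spectral GMM and pitch GMM are stacked into a supervector and regressed against listening-test PF values with SVR. Dimensionalities, component counts, and variable names are illustrative assumptions.

```python
# Sketch: GMM supervectors (concatenated component means) fed to SVR.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVR

def supervector(utt_spec, utt_f0, n_components=4):
    """Stack the means of per-utterance spectral and pitch GMMs."""
    spec_gmm = GaussianMixture(n_components=n_components, covariance_type="diag").fit(utt_spec)
    f0_gmm = GaussianMixture(n_components=n_components, covariance_type="diag").fit(utt_f0)
    return np.concatenate([spec_gmm.means_.ravel(), f0_gmm.means_.ravel()])

rng = np.random.default_rng(0)
# Placeholder corpus: per-utterance frame features plus listening-test PF scores.
utterances = [(rng.normal(size=(300, 12)), rng.normal(5.3, 0.3, (300, 1))) for _ in range(60)]
pf_ratings = rng.uniform(1, 7, 60)

X = np.array([supervector(spec, f0) for spec, f0 in utterances])
svr = SVR(kernel="linear").fit(X, pf_ratings)  # discriminative regression on supervectors
print("predicted PF for first utterance:", svr.predict(X[:1]))
```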