The involvement of emotional states in intelligent spoken human-computer interfaces has recently evolved into an active field of research. In this article we describe the enhancements and optimizations of a speech-based emotion recognizer operating jointly with automatic speech recognition. We argue that knowledge about the textual content of an utterance can improve the recognition of its emotional content. Having outlined the experimental setup, we present results and demonstrate the capability of a postprocessing algorithm that combines multiple speech-emotion recognizers. For dialogue management we propose a stochastic approach comprising a dialogue model and an emotional model which interact within a combined dialogue-emotion model. These models are trained on dialogue corpora and, assigned different weighting factors, jointly determine the course of the dialogue.
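The abstract does not specify the form of the combined dialogue-emotion model. Purely as an illustrative sketch, one plausible reading is a pair of bigram-style transition models, one over dialogue states and one over emotional states, interpolated with weighting factors. All names below (`CombinedDialogueEmotionModel`, `lambda_dialogue`, `lambda_emotion`) are hypothetical and not taken from the paper.

```python
from collections import defaultdict


class CombinedDialogueEmotionModel:
    """Illustrative sketch: interpolates bigram transition probabilities
    from a dialogue model and an emotion model, each trained on corpora
    of (dialogue_state, emotion_state) sequences."""

    def __init__(self, lambda_dialogue=0.7, lambda_emotion=0.3):
        # Weighting factors governing each model's influence on the dialogue.
        assert abs(lambda_dialogue + lambda_emotion - 1.0) < 1e-9
        self.lambda_d = lambda_dialogue
        self.lambda_e = lambda_emotion
        self.dialogue_counts = defaultdict(lambda: defaultdict(int))
        self.emotion_counts = defaultdict(lambda: defaultdict(int))

    def train(self, corpus):
        """corpus: iterable of dialogues, each a list of
        (dialogue_state, emotion_state) pairs."""
        for dialogue in corpus:
            for prev, cur in zip(dialogue, dialogue[1:]):
                self.dialogue_counts[prev[0]][cur[0]] += 1
                self.emotion_counts[prev[1]][cur[1]] += 1

    def _prob(self, counts, prev, cur):
        # Maximum-likelihood transition probability; 0.0 for unseen contexts.
        total = sum(counts[prev].values())
        return counts[prev][cur] / total if total else 0.0

    def score(self, prev_state, next_state):
        """Weighted interpolation of the two transition probabilities;
        the candidate maximizing this score would steer the dialogue."""
        p_d = self._prob(self.dialogue_counts, prev_state[0], next_state[0])
        p_e = self._prob(self.emotion_counts, prev_state[1], next_state[1])
        return self.lambda_d * p_d + self.lambda_e * p_e
```

Under this reading, shifting weight toward `lambda_emotion` would let the user's detected emotional state (e.g., anger after a misrecognition) steer the dialogue toward repair moves, while a dominant `lambda_dialogue` keeps the system close to the task structure learned from the corpora.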