Most models of automatic emotion recognition adopt a discrete perspective and a black-box approach: they output an emotion label chosen from a limited pool of candidate terms on the basis of purely statistical methods. Although these models are successful at emotion classification, a number of practical and theoretical drawbacks limit their range of possible applications. In this paper, the authors suggest adopting an appraisal perspective in modeling emotion recognition. They propose to use appraisals as an intermediate layer between expressive features (input) and emotion labels (output). The model would then consist of two parts: first, expressive features are used to estimate appraisals; second, the resulting appraisals are used to predict an emotion label. While the second part of the model has already been the object of several studies, the first is still unexplored. The authors argue that this model should be built on the basis of both theoretical predictions and empirical results about the link between specific appraisals and expressive features. For this purpose, they suggest using the component process model of emotion, which includes detailed predictions of the efferent effects of appraisals on facial expression, voice, and body movements.
…Mehu, Pantic, & Scherer, in press). Current results are promising, and we can expect that in the near future these systems will become fully reliable and perform satisfactorily. As the detection problem is being solved, attention should now turn to the question of which model is best suited to attributing an emotional meaning.¹ Indeed, emotion recognition systems can be conceived of as consisting of two parts: a detection component and an inference component. The detection component analyzes the facial movements; the inference component attributes an emotional meaning to the movements detected by the first component. While the detection component has one recognized standard (FACS), for the inference component we must turn to emotion psychology, where multiple theoretical models currently co-exist. Most researchers in affective computing choose a pragmatic approach and avoid theoretical controversies, but every system necessarily implies theoretical assumptions (Calvo & D'Mello, 2010).

In the next paragraphs we will first present different theoretical models of emotion and discuss their use for automatic emotion recognition. The goal is not to provide an exhaustive review of the available systems, but rather a brief description of the pros and cons of each choice. We will then introduce a specific componential appraisal model of emotion.
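To make the two-component architecture concrete, the sketch below illustrates how an appraisal layer could sit between a FACS-based detection component and the final emotion label. All names, appraisal dimensions, and stub functions (detect_action_units, estimate_appraisals, label_emotion) are illustrative assumptions introduced for exposition, not part of the authors' system; in a real system each stage would be replaced by a trained or theory-driven model.

# Minimal, illustrative sketch (not the authors' implementation) of the pipeline
# discussed above: a detection component producing FACS action-unit intensities,
# followed by a two-stage inference component in which hypothetical appraisal
# dimensions mediate between expressive features and the emotion label.

from dataclasses import dataclass
from typing import Dict

# Hypothetical appraisal dimensions, loosely inspired by the component process model.
APPRAISAL_DIMENSIONS = (
    "novelty",
    "intrinsic_pleasantness",
    "goal_conduciveness",
    "coping_potential",
    "norm_compatibility",
)

@dataclass
class AppraisalProfile:
    # One score per appraisal dimension, here assumed to lie in [0, 1].
    values: Dict[str, float]

def detect_action_units(frame) -> Dict[str, float]:
    """Detection component: estimate FACS action-unit intensities for one frame.
    Stub only; any action-unit detector could be plugged in here."""
    return {"AU4": 0.0, "AU12": 0.0}  # placeholder intensities

def estimate_appraisals(action_units: Dict[str, float]) -> AppraisalProfile:
    """First inference stage (the part the paper identifies as unexplored):
    map expressive features to an appraisal profile. A trained regressor or
    theory-driven rules would replace this stub."""
    return AppraisalProfile(values={d: 0.0 for d in APPRAISAL_DIMENSIONS})

def label_emotion(profile: AppraisalProfile) -> str:
    """Second inference stage: map the appraisal profile to an emotion label,
    the mapping that previous studies have already addressed."""
    return "neutral"  # placeholder label

def recognize(frame) -> str:
    """End-to-end pipeline: expressive features -> appraisals -> emotion label."""
    return label_emotion(estimate_appraisals(detect_action_units(frame)))

if __name__ == "__main__":
    print(recognize(frame=None))  # prints "neutral" with the stub components

The point of the sketch is only to show where the appraisal layer sits: the intermediate AppraisalProfile is an explicit, inspectable representation, in contrast with the black-box mapping from features to labels used by most current systems.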