Toward multimodal fusion of affective cues

Paleari, Marco; Lisetti, Christine L.

doi:10.1145/1178745.1178762

Cited by 36 publications

(23 citation statements)

References 42 publications


“…Facial Animation Parameters -FAPs are extracted from video data and are used together with low level audio features as input for a HMM to classify the human emotions. The paper of (Paleari and Lisetti, 2006) presents a multimodal fusion framework for emotion recognition that relies on MAUI -Multimodal Affective User Interface paradigm. The approach is based on the Scherer's theory Component Process Theory (CPT) for the definition of the user model and to simulate the agent emotion generation.…”

Section: Related Work (mentioning)

confidence: 99%

Semantic Audiovisual Data Fusion for Automatic Emotion Recognition

Datcu, Rothkrantz

2015

Emotion Recognition


The paper describes a novel technique for the recognition of emotions from multimodal data. We focus on the recognition of the six prototypic emotions. The results from the facial expression recognition and from the emotion recognition from speech are combined using a bimodal semantic data fusion model that determines the most probable emotion of the subject. Two types of models based on geometric face features are used for facial expression recognition, depending on the presence or absence of speech. In our approach we define an algorithm that is robust to changes of face shape that occur during regular speech. The influence of phoneme generation on the face shape during speech is removed by using features that are only related to the eyes and the eyebrows. The paper includes results from testing the presented models.


“…These issues need to be addressed in follow-up studies to obtain a better understanding of the interaction between various expressive cues, sources and modalities in HHI. The multimodal affect systems should potentially be able to detect incongruent messages and label them as incongruent for further/detailed understanding of the information being conveyed (Paleari & Lisetti, 2006). Different to the cross-mode compensation but still part of the multicue or multimodal perception, there exist findings reporting that when distance is involved humans tend to process the overall global information rather than considering configurations of local regions.…”

Section: Multimodal Expression and Perception Of Emotions (mentioning)

confidence: 99%

“…Video features and fNIRS features can be fused at the feature or decision level on a block-by-block basis. (Paleari & Lisetti, 2006) introduce a generic framework with 'resynchronization buffers'. They aim to compare the different estimations, and realign the different evaluations so that they correspond to the same phenomenon even if one estimation is delayed compared to the other one.…”

Section: Challenges (mentioning)

confidence: 99%

“…However, to date, most of the existing fusion algorithms have not been made adaptive to the input quality and therefore do not consider eventual changes on the reliability of the different information channels. (Paleari & Lisetti, 2006) proposed a generic fusion framework that is able to accept different single and multimodal recognition systems and to automatically adapt the fusion algorithm to find optimal solutions, and be adaptive to channel (and system) reliability. They describe a bufferized approach where two different fusion chains would be active in parallel.…”

Section: Facilitators (mentioning)

confidence: 99%


From the Lab to the Real World: Affect Recognition Using Multiple Cues and Modalities

Güneş, Piccardi, Pantić

2008

Affective Computing


“…In human interaction, 55% of affective information is carried by the body whilst 38% by the voice tone and volume, and only 7% by the words spoken [1]. Ekman [2] further suggests that non-verbal behaviours are the primary vehicles for expressing emotion.…”

Section: Introduction (mentioning)

confidence: 99%

Multimodal Affect Recognition in Intelligent Tutoring Systems

Banda, Robinson

2011

Affective Computing and Intelligent Interaction


Abstract. This paper concerns the multimodal inference of complex mental states in the intelligent tutoring domain. The research aim is to provide intervention strategies in response to a detected mental state, with the goal being to keep the student in a positive affect realm to maximize learning potential. The research follows an ethnographic approach in the determination of affective states that naturally occur between students and computers. The multimodal inference component will be evaluated from video and audio recordings taken during classroom sessions. Further experiments will be conducted to evaluate the affect component and educational impact of the intelligent tutor.
