Emotion Recognition 2015
DOI: 10.1002/9781118910566.ch16

Semantic Audiovisual Data Fusion for Automatic Emotion Recognition

Abstract: The paper describes a novel technique for the recognition of emotions from multimodal data. We focus on the recognition of the six prototypic emotions. The results from facial expression recognition and from emotion recognition from speech are combined using a bimodal semantic data fusion model that determines the most probable emotion of the subject. Two types of models based on geometric face features are used for facial expression recognition, depending on the presence or absence o…
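A minimal sketch of this kind of decision-level audiovisual fusion over the six prototypic emotions is given below, assuming each unimodal recognizer emits a posterior distribution over the same label set; the weighted-sum rule and the weights are illustrative placeholders, not the chapter's actual semantic fusion model.

import numpy as np

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def fuse_posteriors(p_face, p_speech, w_face=0.6, w_speech=0.4):
    # Combine the two unimodal posteriors with a weighted sum, renormalize,
    # and return the most probable emotion plus the fused distribution.
    p_face = np.asarray(p_face, dtype=float)
    p_speech = np.asarray(p_speech, dtype=float)
    fused = w_face * p_face + w_speech * p_speech
    fused /= fused.sum()
    return EMOTIONS[int(np.argmax(fused))], fused

# Example: the face model leans toward happiness, the speech model toward surprise.
label, dist = fuse_posteriors(
    [0.05, 0.05, 0.05, 0.55, 0.10, 0.20],
    [0.05, 0.05, 0.10, 0.30, 0.10, 0.40],
)
print(label, dist.round(3))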

Cited by 40 publications (31 citation statements) | References 25 publications
“…[15] developed the Facial Action Coding System (FACS) and identified six facial expressions that provide sufficient clues to detect emotions. Recent studies on speech-based emotion analysis [4] have focused on identifying several acoustic features. One of the earliest works on fusing audio-visual emotion recognition [16] showed that a bimodal system yields higher accuracy than any unimodal system.…”
Section: Related Work and Contributions (citation type: mentioning, confidence: 99%)
“…People, however, are gradually shifting from text to video to express their opinion about a product or service, as it is now much easier and faster for them to produce and share multimodal content [3]. For the same reasons, potential customers are now more inclined to browse for video reviews of the product they are interested in rather than look for lengthy written reviews [4]. Another reason for doing this is that, while trustworthy written reviews are quite hard to find, searching for good video reviews is as easy as typing the name of the product on YouTube and choosing the clips with the most views [5].…”
Section: Introduction (citation type: mentioning, confidence: 99%)
“…Processing | Classification algorithm | Accuracy
Lanitis et al [11] | Appearance Model | Distance-based | 74%
Cohen et al [16] | Appearance Model | Bayesian network | 83%
Mase [13] | Optical flow | kNN | 86%
Rosenblum et al [17] | …
Recent studies on speech-based emotion analysis [12, 21-27] have focused on identifying several acoustic features such as fundamental frequency (pitch), intensity of utterance [15], bandwidth, and duration. The speaker-dependent approach gives much better results than the speaker-independent approach, as shown by the excellent results of Navas et al [29], where about 98% accuracy was achieved by using a Gaussian mixture model (GMM) as the classifier, with prosodic and voice-quality features as well as Mel-frequency cepstral coefficients (MFCCs) employed as speech features.…”
Section: Methods (citation type: mentioning, confidence: 99%)
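The GMM-over-MFCC recipe credited to Navas et al [29] above can be sketched as follows: fit one Gaussian mixture per emotion on frame-level MFCCs and label a new utterance by the highest log-likelihood. The sampling rate, mixture size, and input paths are illustrative assumptions, librosa and scikit-learn are assumed to be available, and the prosodic and voice-quality features are omitted for brevity.

import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_frames(wav_path, n_mfcc=13):
    # Frame-level MFCC matrix of shape (n_frames, n_mfcc).
    y, sr = librosa.load(wav_path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_models(utterances_by_emotion, n_components=8):
    # utterances_by_emotion: dict mapping an emotion label to a list of wav paths.
    models = {}
    for emotion, paths in utterances_by_emotion.items():
        feats = np.vstack([mfcc_frames(p) for p in paths])
        models[emotion] = GaussianMixture(
            n_components=n_components, covariance_type="diag").fit(feats)
    return models

def classify(wav_path, models):
    # Pick the emotion whose GMM gives the highest average per-frame log-likelihood.
    feats = mfcc_frames(wav_path)
    return max(models, key=lambda e: models[e].score(feats))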
“…The Active Appearance Model [11,12] and Optical Flow-based techniques [13] are common approaches that use FACS to understand expressed facial expressions. Exploiting AUs as features, kNN, Bayesian networks, hidden Markov models (HMM) and artificial neural networks (ANN) [14] have been used by many researchers to infer emotions from facial expressions.…”
Section: Video: Recognition of Facial Expression (citation type: mentioning, confidence: 99%)
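The simplest of the classifiers listed above, kNN over action-unit (AU) or geometric face features, can be sketched as follows; the feature vectors are assumed to be precomputed (e.g., AU intensities or landmark distances), and the training data here are random placeholders rather than real annotations.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

rng = np.random.default_rng(0)
X_train = rng.normal(size=(120, 17))    # 120 samples x 17 AU intensities (placeholder)
y_train = rng.integers(0, 6, size=120)  # emotion indices 0..5 (placeholder)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

x_new = rng.normal(size=(1, 17))        # feature vector from one new face image
print(EMOTIONS[int(knn.predict(x_new)[0])])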