2015 International Conference on Affective Computing and Intelligent Interaction (ACII)
DOI: 10.1109/acii.2015.7344602

Bimodal feature-based fusion for real-time emotion recognition in a mobile context

Cited by 8 publications (6 citation statements)
References 28 publications
“…For the lexical modality, sparse features drawn from hand-crafted affective dictionaries are dominant in current studies, e.g., Linguistic Inquiry and Word Count (LIWC) [14] based lexical features [15] and WordNetAffect [16] based lexical features [17]. However, current paralinguistic studies on human-human dialogue suggest that besides lexical content, other phenomena in speech are also indicators of emotion.…”
Section: Features (mentioning)
confidence: 99%
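
As an illustration of the dictionary-based lexical features this excerpt describes, the following is a minimal sketch of deriving sparse category counts from a hand-crafted affect lexicon. The lexicon contents and category names are hypothetical placeholders, not the actual LIWC or WordNetAffect resources used in the cited studies.

    # Minimal sketch: sparse lexical features from a hand-crafted affect lexicon.
    # The lexicon below is a hypothetical placeholder, not LIWC or WordNetAffect.
    from collections import Counter

    AFFECT_LEXICON = {
        "joy":     {"happy", "glad", "delighted", "great"},
        "anger":   {"angry", "furious", "annoyed", "hate"},
        "sadness": {"sad", "unhappy", "miserable", "cry"},
    }

    def lexical_features(utterance: str) -> dict:
        """Count how many tokens of the utterance fall into each affect category.
        Most categories stay at zero for a short utterance, hence 'sparse'."""
        tokens = utterance.lower().split()
        counts = Counter()
        for token in tokens:
            for category, words in AFFECT_LEXICON.items():
                if token.strip(".,!?") in words:
                    counts[category] += 1
        # Normalise by utterance length so features are comparable across utterances.
        return {cat: counts[cat] / max(len(tokens), 1) for cat in AFFECT_LEXICON}

    print(lexical_features("I am so happy and delighted today!"))
    # -> {'joy': 0.2857..., 'anger': 0.0, 'sadness': 0.0}
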
“…In FL fusion (e.g., [15]), feature sets from different modalities are concatenated before performing recognition, as shown in Figure 1. In some studies, feature engineering is first applied to the concatenated feature set or individual feature sets (e.g., [17]). However, it is hard to apply knowledge about different modalities in FL fusion.…”
Section: Multimodal Emotion Recognition (mentioning)
confidence: 99%
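
A minimal sketch of the feature-level (FL) fusion scheme this excerpt refers to: per-modality feature vectors are concatenated into one joint vector before a single classifier is trained. The dimensionalities, the synthetic data, and the SVM classifier are illustrative assumptions, not the cited systems' actual configuration.

    # Feature-level (early) fusion sketch: concatenate per-modality features,
    # then train one classifier on the joint vector. Dimensions and the SVM
    # choice are illustrative assumptions, not the cited systems' setup.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_samples = 200
    acoustic = rng.normal(size=(n_samples, 40))    # e.g. prosodic/spectral features
    lexical  = rng.normal(size=(n_samples, 10))    # e.g. affect-dictionary counts
    labels   = rng.integers(0, 7, size=n_samples)  # seven emotion classes

    # Early fusion: a single joint feature vector per utterance.
    fused = np.concatenate([acoustic, lexical], axis=1)

    X_train, X_test, y_train, y_test = train_test_split(
        fused, labels, test_size=0.25, random_state=0)

    clf = SVC(kernel="rbf").fit(X_train, y_train)
    print("accuracy on held-out data:", clf.score(X_test, y_test))

Because both modalities share one model and one feature space, modality-specific knowledge (e.g. different noise characteristics of speech and text) is hard to exploit here, which is the limitation the excerpt points out.
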
“…The authors also reported an accuracy of 80.36% for audio-visual emotion recognition while employing a hybrid deep model architecture [23]. The corpus was also used for real-time bi-modal emotion detection in a mobile context, where seven emotions were classified and the results were reported in terms of precision (90.8), recall (90.7), and F1-measure (90.7) for a feature-level fusion [156].…”
Section: RML (mentioning)
confidence: 99%
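
For context on the precision/recall/F1 figures quoted above, the sketch below shows one common way such scores are computed for a seven-class emotion task. The labels and predictions are random placeholders, and the weighted averaging is an assumption; the excerpt does not state which averaging scheme the cited work used.

    # Sketch: computing precision, recall and F1 for a 7-class emotion task.
    # Labels and predictions are random placeholders; 'weighted' averaging is
    # an assumption -- the excerpt does not state the averaging scheme used.
    import numpy as np
    from sklearn.metrics import precision_recall_fscore_support

    rng = np.random.default_rng(1)
    y_true = rng.integers(0, 7, size=100)
    y_pred = rng.integers(0, 7, size=100)

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0)
    print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
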
“…SAVEE DB was used for exploring the sources of temporal variation in human audio-visual behavioral data by introducing temporal segmentation and time-series analysis techniques [19]. In a bi-modal fusion of linguistic and acoustic cues in speech, SAVEE was used for affect recognition at the language level using both ML and valence assessment of the words for the classification of 7 emotions [156]. In an affective human-robot interaction, the real-time fusion of facial expressions and speech from SAVEE using 3 DBNs (two for classifying and the third for fusing the output of the first two) resulted in an accuracy of 96.2%.…”
Section: SAVEE (mentioning)
confidence: 99%
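
The three-network arrangement mentioned above (two classifiers plus a third model that fuses their outputs) is essentially decision-level fusion by stacking. The sketch below reproduces that structure with plain logistic-regression models on synthetic data as stand-ins for the DBNs and the SAVEE features used in the cited work.

    # Decision-level fusion sketch in the spirit of the 3-model scheme above:
    # one classifier per modality, plus a third model trained on their class
    # posteriors. Logistic regression and synthetic data stand in for the DBNs
    # and the SAVEE features used in the cited work.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    n, n_classes = 300, 7
    face   = rng.normal(size=(n, 30))             # facial-expression features (placeholder)
    speech = rng.normal(size=(n, 40))             # speech features (placeholder)
    y      = rng.integers(0, n_classes, size=n)

    idx_train, idx_test = train_test_split(np.arange(n), test_size=0.3, random_state=0)

    # First two models: one per modality.
    clf_face   = LogisticRegression(max_iter=1000).fit(face[idx_train], y[idx_train])
    clf_speech = LogisticRegression(max_iter=1000).fit(speech[idx_train], y[idx_train])

    # Third model: fuses the class posteriors of the first two.
    # (In practice the fuser is usually trained on out-of-fold posteriors
    # to avoid overfitting; reusing the training split keeps the sketch short.)
    posteriors = np.hstack([clf_face.predict_proba(face[idx_train]),
                            clf_speech.predict_proba(speech[idx_train])])
    fuser = LogisticRegression(max_iter=1000).fit(posteriors, y[idx_train])

    test_posteriors = np.hstack([clf_face.predict_proba(face[idx_test]),
                                 clf_speech.predict_proba(speech[idx_test])])
    print("fused accuracy:", fuser.score(test_posteriors, y[idx_test]))
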
“…Therefore, there are several types of data sources such as speech, text, facial expression, body movement, and physiological measurements like EEG, finger temperature, skin conductance level, heart rate, and muscle activity [11, 13, 14, 69–73]. There is a wealth of emotion data for speech, text, and facial expression, as these can be easily gathered from devices that people use daily, such as cellphones and computers [75,76]. On the other hand, physiological signals can also be a good source of information since they can be collected continuously without participants interfering [77,78].…”
Section: EEG (mentioning)
confidence: 99%