2022
DOI: 10.1007/s11042-022-13254-8

Building a three-level multimodal emotion recognition framework

Abstract: Multimodal emotion detection has been one of the main lines of research in the field of Affective Computing (AC) in recent years. Multimodal detectors aggregate information coming from different channels or modalities to determine, with a higher degree of accuracy, what emotion users are expressing. However, despite the benefits offered by such detectors, their presence in real implementations is still scarce for various reasons. In this paper, we propose a technology-agnostic framework, HERA, to facilit…

Cited by 12 publications (2 citation statements)
References 62 publications
“…Early MSA often focused on single-modality information such as sound, text, visual, and biological signals. However, relying on a single modality often fails to analyze users' sentiments accurately (Salhi et al., 2021; Mohammed et al., 2022; Garcia-Garcia, 2023). Because the same text may express opposite meanings in different contexts, it is difficult to predict users' sentiments from one modality alone.…”
Section: Introduction
confidence: 99%
“…However, despite the progress in ER using Deep Learning techniques shown in a large number of studies, most of them use datasets built in laboratory environments, such as IEMOCAP [20,23], AMIGOS [24], RECOLA [28], DEAP [21], SEED, SEED-IV, SEED-V, and DREAMER [25], etc. The fundamental problem with any recognition system is the lack of data, or of training with real data, which may affect its generalization to examples that have not been seen during the training process [21,35]. Furthermore, the datasets used to train ER models have been designed in controlled laboratory environments and differ significantly from real conditions in terms of brightness, noise level, etc.…”
Section: Introduction
confidence: 99%