2020
DOI: 10.1007/978-3-030-51870-7_3
A Survey on Automatic Multimodal Emotion Recognition in the Wild

Cited by 40 publications (13 citation statements)
References 131 publications
“…Human-Computer Interaction (HCI) is an essential part of Artificial Intelligence (AI) research. Real-life HCI applications have been facilitated by research on automatic emotion recognition [8], thus improving quality of service and quality of life [9]. The speech signal is one of the key ways for humans to communicate, since it contains a substantial amount of paralinguistic information (e.g., emotional states, attitudes, etc.)…”
Section: Speech Emotion Recognition
confidence: 99%
“…Studies by Albert Mehrabian (Mehrabian 1981) established the 7-38-55% rule, also known as the "3V rule": 7% of communication is verbal, 38% is vocal and 55% is visual. Multimodal Emotion Recognition approaches rely on a combination of facial, body and verbal signals to infer the emotion of a subject (Sharma and Dhall 2021). One state-of-the-art example of multimodal emotion recognition is End-to-End Multimodal Emotion Recognition Using Deep Neural Networks (Tzirakis et al 2017), in which the network comprises two parts: the multimodal feature extraction part and the RNN part.…”
Section: Multimodal Emotion Recognition
confidence: 99%
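The two-part structure described in the statement above (modality-specific feature extraction followed by a recurrent network over the fused sequence) can be sketched in miniature. This is a hedged toy illustration, not the Tzirakis et al. architecture: the dimensions, the linear-plus-ReLU "extractors", and the plain Elman RNN are all stand-in assumptions for what would be learned convolutional networks and an LSTM in the actual paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, not taken from Tzirakis et al. 2017)
T, AUDIO_DIM, VISUAL_DIM, FEAT_DIM, HIDDEN = 5, 40, 64, 32, 16

def extract_features(x, w):
    """Stand-in for a learned per-modality extractor: linear projection + ReLU."""
    return np.maximum(x @ w, 0.0)

# Random weights for the two modality-specific extractors and the RNN
w_audio = rng.normal(size=(AUDIO_DIM, FEAT_DIM)) * 0.1
w_visual = rng.normal(size=(VISUAL_DIM, FEAT_DIM)) * 0.1
w_in = rng.normal(size=(2 * FEAT_DIM, HIDDEN)) * 0.1
w_rec = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1

def rnn(seq):
    """Simple Elman RNN: summarise a sequence into its final hidden state."""
    h = np.zeros(HIDDEN)
    for x_t in seq:
        h = np.tanh(x_t @ w_in + h @ w_rec)
    return h

# Synthetic per-frame audio and visual inputs for one clip
audio = rng.normal(size=(T, AUDIO_DIM))
visual = rng.normal(size=(T, VISUAL_DIM))

# Part 1: modality-specific feature extraction, fused by concatenation
fused = np.concatenate(
    [extract_features(audio, w_audio), extract_features(visual, w_visual)],
    axis=1,
)

# Part 2: the RNN turns the fused frame sequence into one clip-level embedding,
# which a final classifier or regressor would map to an emotion prediction
embedding = rnn(fused)
print(embedding.shape)
```

In the actual end-to-end setting, both parts are trained jointly from raw signals, which is what distinguishes this family of models from pipelines built on hand-crafted features.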
“…Numerous publications in this field are collected through extensive surveys, including [23][24][25]. Since the audio channel does not provide continuity in recognising emotions, and the effectiveness of recognition depends on many factors, including voice quality, this source of information is very often used in multimodal systems [26][27][28]. For example, an interesting emotographic model analyses indicators residing in standard multimodal data produced by commonly used applications and Internet of Things (IoT) devices to interpret human emotional state [22].…”
Section: Emotion Recognition
confidence: 99%