Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1280

ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection

Abstract: Emotion recognition in conversations is crucial for building empathetic machines. Current work in this domain does not explicitly consider the inter-personal influences that thrive in the emotional dynamics of dialogues. To this end, we propose Interactive COnversational memory Network (ICON), a multimodal emotion detection framework that extracts multimodal features from conversational videos and hierarchically models the self- and inter-speaker emotional influences into global memories. Such memories generate con…

Cited by 329 publications (198 citation statements). References 38 publications.
“…Recent works on ERC, e.g., DialogueRNN [11] or ICON [23], strive to address several key research challenges that make the task of ERC difficult to solve: a) Categorization of emotions: Emotion is defined using two types of models, categorical and dimensional. The categorical model classifies emotion into a fixed number of discrete categories.…”
Section: Research Challenges
confidence: 99%
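To make the distinction in that statement concrete, here is a tiny illustrative Python sketch contrasting the two emotion models; the label set and value ranges are assumptions for illustration, not taken from the cited papers.

    # Categorical model: each utterance receives one discrete emotion label.
    CATEGORICAL_LABELS = ["happy", "sad", "angry", "neutral", "excited", "frustrated"]

    # Dimensional model: emotion as continuous coordinates, e.g. valence and arousal in [-1, 1].
    dimensional_example = {"valence": -0.7, "arousal": 0.4}  # unpleasant, moderately activated

    assert "angry" in CATEGORICAL_LABELS
    assert all(-1.0 <= v <= 1.0 for v in dimensional_example.values())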
“…Due to the sequential nature of the utterances in conversations, RNNs are used for context generation in the aforementioned models. The dataset statistics are (dialogues and utterances, train / val / test):

IEMOCAP: dialogues 120 / – / 31; utterances 5810 / – / 1623
SEMAINE: dialogues 63 / – / 32; utterances 4368 / – / 1430
EmotionLines: dialogues 720 / 80 / 200; utterances 10561 / 1178 / 2764
MELD: dialogues 1039 / 114 / 280; utterances 9989 / 1109 / 2610

For [23], as in our experiment, we disregard their contextual feature extraction and pre-processing part. More details can be found in Majumder et al. [11].…”
Section: Recent Advances
confidence: 99%
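The quoted passage describes the common pattern of running a recurrent encoder over per-utterance features to obtain context-aware representations. The sketch below is a minimal, hypothetical PyTorch version of that idea; the module name, feature sizes and class count are assumptions, not the cited authors' code.

    import torch
    import torch.nn as nn

    class ContextEncoder(nn.Module):
        """GRU over the sequence of utterance feature vectors in one dialogue."""
        def __init__(self, utt_dim=100, hidden_dim=64, n_classes=6):
            super().__init__()
            self.gru = nn.GRU(utt_dim, hidden_dim, batch_first=True)  # sequential context
            self.clf = nn.Linear(hidden_dim, n_classes)               # per-utterance emotion logits

        def forward(self, utterances):        # utterances: (batch, seq_len, utt_dim)
            states, _ = self.gru(utterances)  # one contextual state per utterance
            return self.clf(states)           # (batch, seq_len, n_classes)

    # Toy usage: one dialogue of 10 utterances with 100-dimensional features.
    logits = ContextEncoder()(torch.randn(1, 10, 100))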
“…Various approaches based on deep [7], convolutional [10] and recurrent neural networks [3] have been proposed for speech- and video-based emotion recognition and text-based sentiment analysis. Variants of such networks with memory blocks [11] and attention mechanisms [12] were recently developed for multi-modal emotion recognition.…”
Section: Introduction
confidence: 99%
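The "memory blocks with attention" idea mentioned above amounts to letting the current utterance attend over stored representations of the conversational history. A minimal sketch, assuming hypothetical dimensions and simple dot-product attention rather than any specific paper's implementation:

    import torch

    def attend_over_memory(query, memory):
        # query:  (d,)   representation of the current utterance
        # memory: (t, d) stored representations of t previous utterances
        scores = memory @ query                 # similarity of the query to each memory cell
        weights = torch.softmax(scores, dim=0)  # attention distribution over the history
        return weights @ memory                 # context vector summarising the relevant history

    context = attend_over_memory(torch.randn(64), torch.randn(5, 64))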
“…Rather, such models are useful in modeling complex inter-modality dynamics. The ICON model [11] uses early fusion and dialogue-level decision making by considering the histories of the dyadic interactions. However, such systems require long contextual information, which is not readily available in typical real-time interactions between humans and computers.…”
Section: Introduction
confidence: 99%
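For reference, "early fusion" with dyadic histories can be pictured as concatenating the per-modality features of each utterance and accumulating separate histories for the speaker and the interlocutor. The snippet below is only an illustrative sketch with assumed feature sizes and a toy alternating two-speaker dialogue; it does not reproduce ICON's architecture.

    import torch

    def early_fusion(text_f, audio_f, video_f):
        # Concatenate modality features into one joint utterance representation.
        return torch.cat([text_f, audio_f, video_f], dim=-1)

    self_history, other_history = [], []   # dyadic context accumulated over the dialogue
    for turn in range(4):                  # toy dialogue of 4 alternating turns
        fused = early_fusion(torch.randn(100), torch.randn(73), torch.randn(512))
        (self_history if turn % 2 == 0 else other_history).append(fused)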
“…Although researchers mainly focus on emotion detection in text in the absence of context (Klinger et al., 2018), typically extracted from social media, recently a few works have approached emotion detection in conversations by using context information (Hazarika et al., 2018b; Majumder et al., 2018; Hazarika et al., 2018a). These contextual systems work on long conversations in which different users are involved, and they use multimodal data, specifically text, audio and video, in order to address the emotion detection problem in large multi-party conversations.…”
Section: Introduction
confidence: 99%