Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017
DOI: 10.18653/v1/d17-1115

Tensor Fusion Network for Multimodal Sentiment Analysis

Abstract: Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Network, which learns both such dynamics end-to-end. The proposed approach is tailored for the volatile nature of spo…

Cited by 1,056 publications (487 citation statements)
References 51 publications
“…Recently, Poria et al successfully used RNN‐based deep networks for multimodal emotion recognition and it was followed by other works by Zadeh et al; Hazarika et al, used memory networks for emotion recognition in dyadic conversations, where two distinct memory networks enabled interspeaker interaction, yielding state‐of‐the‐art performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…There are many commercial and social applications related to sentiment analysis. Actually, sentiment analysis can utilize multimodal data including text, speech and video [33]- [35]. In natural language processing, a basic sentiment analysis task is to classify the polarity of a given text at the document, sentence, or feature and aspect level.…”
Section: B. Sentiment Analysis (SA) (mentioning)
confidence: 99%
“…Methods that learn the modalities independently and fuse the output of modality specific representations [1,2], 2. Methods that jointly learn the interactions between two or three modalities [3,4], and 3. Methods that explicitly learn contributions from these unimodal and cross modal cues, typically using attention based techniques [5,6,7,8,9,10].…”
Section: Introduction (mentioning)
confidence: 99%
“…Most of the existing approaches propose either fusion at different granularities [3,9] or use a cross interaction block that couple the features from different modalities [10,6].…”
Section: Introduction (mentioning)
confidence: 99%