2022
DOI: 10.48550/arxiv.2201.03969
Preprint

Multimodal Representations Learning Based on Mutual Information Maximization and Minimization and Identity Embedding for Multimodal Sentiment Analysis

Abstract: Multimodal sentiment analysis (MSA) is a fundamentally complex research problem due to the heterogeneity gap between different modalities and the ambiguity of human emotional expression. Although there have been many successful attempts to construct multimodal representations for MSA, two challenges remain to be addressed: 1) a more robust multimodal representation needs to be constructed to bridge the heterogeneity gap and cope with complex multimodal interactions, and 2) the contextual dynamics mu…

Cited by 1 publication (1 citation statement); the citing work was published in 2023. References 26 publications.
“…MMIM also applies MI maximization between the fusion result and input modalities again, ensuring that the fusion output sufficiently captures modality-invariant clues among the modalities. Like Han et al [96], Zheng et al [97] also introduced a multimodal representation model, MMMIE, grounded in the principles of mutual information maximization, minimization, and identity embedding. This model aims to maximize the mutual information between modalities while minimizing the mutual information between input data and its features, extracting modality-invariant and task-related information.…”
Section: Simple Concatenation Fusion (citation type: mentioning)
confidence: 99%