Proceedings of the 2021 International Conference on Multimodal Interaction 2021
DOI: 10.1145/3462244.3479919
Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis

Abstract: Multimodal sentiment analysis aims to extract and integrate semantic information collected from multiple modalities to recognize the expressed emotions and sentiment in multimodal data. This research area's major concern lies in developing an extraordinary fusion scheme that can extract and integrate key information from various modalities. However, previous work is restricted by the lack of leveraging dynamics of independence and correlation between modalities to reach top performance. To mitigate this, we pr…

Cited by 127 publications (41 citation statements). References 37 publications.
“…• LMF: The model uses tensors to explore the interactions between modalities and uses low-rank decomposition to alleviate the problem of the number of parameters. • MFM: To enhance the robustness of the model in capturing intra- and inter-modality dynamics, MFM is a cycle-style generative-discriminative model [14]. • MulT: The Multimodal Transformer constructs an architecture of unimodal and crossmodal transformer networks and completes the fusion process through attention [15].…”
Section: Methods (mentioning)
confidence: 99%
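The LMF idea cited above, replacing a full multimodal fusion tensor with a sum of per-modality low-rank factors, can be sketched as follows. This is a minimal illustration, not the cited implementation; all dimensions, weights, and the rank are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature sizes for audio, visual, text, plus output size and rank.
d_a, d_v, d_t, d_out, rank = 8, 16, 32, 4, 4

# Append a constant 1 to each modality vector so unimodal and bimodal terms
# survive the elementwise product (the standard tensor-fusion trick).
h_a = np.concatenate([rng.normal(size=d_a), [1.0]])
h_v = np.concatenate([rng.normal(size=d_v), [1.0]])
h_t = np.concatenate([rng.normal(size=d_t), [1.0]])

# Low-rank factors: one (rank, d+1, d_out) stack per modality, standing in
# for the full (d_a+1) x (d_v+1) x (d_t+1) x d_out fusion tensor.
W_a = rng.normal(size=(rank, d_a + 1, d_out))
W_v = rng.normal(size=(rank, d_v + 1, d_out))
W_t = rng.normal(size=(rank, d_t + 1, d_out))

# Fused output: sum over rank of elementwise products of per-modality projections.
z = sum((h_a @ W_a[r]) * (h_v @ W_v[r]) * (h_t @ W_t[r]) for r in range(rank))

# The parameter saving the citation refers to: factors grow additively in the
# modality dimensions, while the full tensor grows multiplicatively.
low_rank_params = rank * ((d_a + 1) + (d_v + 1) + (d_t + 1)) * d_out
full_tensor_params = (d_a + 1) * (d_v + 1) * (d_t + 1) * d_out
print(z.shape, low_rank_params, full_tensor_params)
```

With these toy sizes the factorized form uses 944 parameters versus 20,196 for the full tensor, which is the "alleviate the number of parameters" point made in the citation statement.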
“…The CMU-MOSEI dataset is an upgraded version of CMU-MOSI in terms of the number of samples. It is also enriched in the diversity of its speakers and covers a broader scope of topics [14].…”
Section: Datasets (mentioning)
confidence: 99%
“…The high performance of most existing models depends on a great number of learnable parameters [15, 16], ignoring potential applications in promising areas like human–computer interaction (HCI), which require real-time, lightweight models. Thus, a lightweight model is necessary to improve the feasibility and practicability of applying speech emotion recognition.…”
Section: Introduction (mentioning)
confidence: 99%