Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.21

Improving Multimodal Fusion via Mutual Dependency Maximisation

Abstract: Multimodal sentiment analysis is a trending area of research, and multimodal fusion is one of its most active topics. Acknowledging that humans communicate through a variety of channels (i.e. visual, acoustic, linguistic), multimodal systems aim at integrating different unimodal representations into a synthetic one. So far, considerable effort has been made on developing complex architectures allowing the fusion of these modalities. However, such systems are mainly trained by minimising simple losses such as L1 …
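To give a concrete feel for what maximising mutual dependency between modalities can look like at training time, here is a minimal sketch, assuming a PyTorch setup, in which an InfoNCE-style lower bound on the mutual information between unimodal embeddings is added to the usual L1 task loss. The function names, the choice of InfoNCE as estimator, and the weight `lam` are illustrative assumptions, not the paper's exact objective.

```python
# Minimal sketch (assumed PyTorch setup): augment a standard L1 task loss with an
# InfoNCE-style term that encourages dependency between unimodal embeddings.
# Names and hyper-parameters are illustrative, not the paper's exact objective.
import torch
import torch.nn.functional as F

def infonce_bound(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Negative InfoNCE loss over a batch of paired embeddings of shape (B, d).

    Maximising this quantity tightens a lower bound on I(z_a; z_b),
    up to an additive log(batch size) constant.
    """
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature                   # (B, B) pairwise similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)  # matching pairs on the diagonal
    return -F.cross_entropy(logits, targets)

def fusion_loss(pred, target, z_text, z_audio, z_video, lam: float = 0.1):
    task = F.l1_loss(pred, target)                          # the usual L1 regression loss
    dependency = (infonce_bound(z_text, z_audio)
                  + infonce_bound(z_text, z_video)
                  + infonce_bound(z_audio, z_video)) / 3    # average pairwise dependency
    return task - lam * dependency                          # minimise task loss, maximise dependency
```

In this form the regulariser is computed pairwise over the text, audio and video embeddings; other variational estimators of mutual information could be plugged into the same template.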

Cited by 13 publications (5 citation statements) · References 46 publications
“…Among available contrast measures, the Fisher-Rao distance is parameter-free and thus easy to use in practice, while the AB-Divergence achieves better results but requires selecting α and β. Future work includes extending our metrics to new tasks such as SLU (Chapuis et al. 2020, 2021; Dinkar et al. 2020; Colombo, Clavel, and Piantanida 2021), controlled sentence generation (Colombo et al. 2019, 2021b) and multi-modal learning (Colombo et al. 2021a; Garcia et al. 2019).…”
Section: Summary and Concluding Remarks
confidence: 99%
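For readers unfamiliar with the contrast measures compared above: between two categorical (softmax) distributions the Fisher-Rao distance has a closed form with no tunable hyper-parameters, which is what "parameter-free" refers to, whereas the AB-Divergence requires choosing α and β. A minimal sketch of the categorical case, under that assumption:

```python
# Sketch: Fisher-Rao distance between two categorical distributions p and q.
# Closed form on the probability simplex: d_FR(p, q) = 2 * arccos(sum_i sqrt(p_i * q_i)).
# No hyper-parameters are involved, which is the "parameter-free" property noted above.
import numpy as np

def fisher_rao_distance(p: np.ndarray, q: np.ndarray) -> float:
    bc = np.sum(np.sqrt(p * q))                  # Bhattacharyya coefficient
    return float(2.0 * np.arccos(np.clip(bc, 0.0, 1.0)))

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])
print(fisher_rao_distance(p, q))                 # 0 when p == q, larger as they diverge
```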
“…• Recently, Colombo et al. (2021) conducted experiments introducing an information regularizer on existing architectures. The main differences between our method and theirs are: a) our method focuses on synergy terms, whereas their proposal optimises the joint mutual information between different unimodal representations; and b) they experiment with variational measures of information.…”
Section: Models
confidence: 99%
“…Multi-modal fusion, which integrates information from multiple modalities into a compact and informative representation, poses a significant challenge, as it requires effectively correlating the semantics of diverse modalities. In recent years, several approaches have been developed to learn joint embeddings of multiple modalities [1,2]. However, each modality exhibits distinct representations and statistical features, making it difficult to capture complex inter-modal correlations.…”
Section: Introduction
confidence: 99%
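As a concrete, hypothetical illustration of learning a joint embedding of multiple modalities: a minimal late-fusion module could project each modality into a shared space and fuse the projections into one compact vector. The dimensions and architecture below are assumptions for illustration, not taken from the cited approaches.

```python
# Hypothetical minimal example of a joint embedding over three modalities:
# project each modality, concatenate, and fuse into one compact representation.
# Real systems typically use richer interaction mechanisms (attention, tensor fusion, etc.).
import torch
import torch.nn as nn

class SimpleJointEmbedding(nn.Module):
    def __init__(self, d_text: int, d_audio: int, d_video: int, d_joint: int = 128):
        super().__init__()
        self.proj_t = nn.Linear(d_text, d_joint)
        self.proj_a = nn.Linear(d_audio, d_joint)
        self.proj_v = nn.Linear(d_video, d_joint)
        self.fuse = nn.Sequential(nn.Linear(3 * d_joint, d_joint), nn.ReLU())

    def forward(self, x_t, x_a, x_v):
        z = torch.cat([self.proj_t(x_t), self.proj_a(x_a), self.proj_v(x_v)], dim=-1)
        return self.fuse(z)                      # compact joint representation

# Usage with dummy inputs (feature sizes are arbitrary placeholders):
model = SimpleJointEmbedding(d_text=768, d_audio=74, d_video=35)
joint = model(torch.randn(8, 768), torch.randn(8, 74), torch.randn(8, 35))
```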