2021
DOI: 10.48550/arxiv.2106.11059
Preprint

Improving Multi-Modal Learning with Uni-Modal Teachers

Chenzhuang Du, Tingle Li, Yichen Liu, et al.

Abstract: Learning multi-modal representations is an essential step towards real-world robotic applications, and various multi-modal fusion models have been developed for this purpose. However, we observe that existing models, whose objectives are mostly based on joint training, often suffer from learning inferior representations of each modality. We name this problem Modality Failure, and hypothesize that the imbalance of modalities and the implicit bias of common objectives in fusion method prevent encoders of each mo…

Cited by 2 publications (5 citation statements)
References 37 publications
“…Wang et al. [14] used a gradient-blending method to address the differing generalization speeds of different modalities' data. Du et al. [15] used uni-modal self-distillation and distilled fusion features to address modality imbalance and the implicit bias of joint objectives in fusion methods. Peng [16] proposed dynamic gradient modulation for adaptive control, balancing each modality's contribution to the learning objective by monitoring the differences between modalities.…”
Section: Related Work
confidence: 99%
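The uni-modal distillation idea summarized above can be sketched as a loss that pulls each modality's features in the multi-modal model toward features from separately trained uni-modal teachers, on top of the joint task loss. This is a minimal illustrative sketch, not the paper's actual implementation; the names `task_loss` and `alpha` are assumptions.

```python
import numpy as np

def unimodal_distill_loss(task_loss, student_feats, teacher_feats, alpha=0.5):
    """Hypothetical sketch of a uni-modal distillation objective.

    task_loss: scalar joint-training loss of the multi-modal model.
    student_feats / teacher_feats: dicts mapping modality name -> feature array,
    where teacher features come from frozen, separately trained uni-modal models.
    alpha: assumed weight trading off the task loss against distillation.
    """
    distill = 0.0
    for m in student_feats:
        # L2 distance between the multi-modal model's per-modality features
        # and the corresponding uni-modal teacher's features
        distill += np.mean((student_feats[m] - teacher_feats[m]) ** 2)
    return task_loss + alpha * distill
```

Minimizing the distillation term keeps each encoder close to its uni-modal teacher, which is one way to counteract the Modality Failure the abstract describes.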
“…Several recent studies [5,40,41] have shown that many multimodal DNNs cannot outperform the best single-modal DNNs. Wang et al. [40] found that different modalities overfit and generalize at different rates, and thus yield suboptimal solutions when trained jointly under a unified optimization strategy.…”
Section: Imbalanced Multimodal Learning
confidence: 99%
“…Several approaches have been proposed recently to deal with the modality-imbalance problem [5,29,40,44]. Wang et al. [40] attached an additional classifier to each modality and to the fused representation, then solved the gradient-blending problem they introduced to obtain better weights for each branch.…”
Section: Imbalanced Multimodal Learning
confidence: 99%