2022
DOI: 10.48550/arxiv.2202.09195
Preprint

A Review on Methods and Applications in Multimodal Deep Learning

Abstract: Deep Learning has implemented a wide range of applications and has become increasingly popular in recent years. The goal of multimodal deep learning (MMDL) is to create models that can process and link information using various modalities. Despite the extensive development made for unimodal learning, it still cannot cover all the aspects of human learning. Multimodal learning helps to understand and analyze better when various senses are engaged in the processing of information. This paper focuses on multiple …

Cited by 2 publications (4 citation statements)
References 90 publications (144 reference statements)
“…The survey focuses on audio, video, and text modalities. Another article [8] reviewed applied methods and applications in multimodal deep learning, where the authors concentrated on a few common deep learning (DL) methods and applications. Apart from this article, we discovered a few surveys that discussed how MML could solve different problems related to modalities.…”
Section: Related Studies (mentioning)
confidence: 99%
“…Although article [1] focused on three modalities, this paper contained all the possible modalities. Article [8] discussed the current use of ML methods and applications in MML, but they limited their review by selecting typical ML methods and applications. Contrary this study presented all ML algorithms, domains, and applications there were available in the search range.…”
Section: Related Studies (mentioning)
confidence: 99%
“…Multi-Modal Translation, defined as the task to transfer or translate knowledge from a source modality to a target one [22], enables one to learn a mapping from a source modality to a target one. Multi-Modal Translation includes variety of applications, such as Image Captioning [8] (generation of a textual representation from an image) and Multi-Modal Speech synthesis [22] (generating audio given its textual representation).…”
Section: Multi-source Fault Diagnosis (mentioning)
confidence: 99%
“…Multi-Modal Translation, defined as the task to transfer or translate knowledge from a source modality to a target one [22], enables one to learn a mapping from a source modality to a target one. Multi-Modal Translation includes variety of applications, such as Image Captioning [8] (generation of a textual representation from an image) and Multi-Modal Speech synthesis [22] (generating audio given its textual representation). It is worth mentioning that Multi-Modal translation where the target modality is high-dimensional can get extremely challenging; one way to respond to this challenge is translating to a low-dimensional representation of the target modality containing higher level of semantic information in comparison with the input belonging to the source modality [27].…”
Section: Multi-source Fault Diagnosis (mentioning)
confidence: 99%