2019
DOI: 10.1609/aaai.v33i01.33015652

Deep Robust Unsupervised Multi-Modal Network

Abstract: In real-world applications, data often come with multiple modalities, and many multi-modal learning approaches have been proposed to integrate information from different sources. Most previous multi-modal methods exploit modal consistency to reduce the complexity of the learning problem, and therefore require modal completeness to be guaranteed. However, due to data collection failures, self-deficiencies, and various other reasons, multi-modal instances are often incomplete in real applications, a…

Cited by 6 publications (4 citation statements)
References 19 publications (19 reference statements)

“…(1) Dense features: Since ETA is a time prediction problem, the time-related features are particularly important, so we calculated the sum and maximum value of link-time and cross-time [9]. In addition, the state of road conditions is a key piece of information.…”
Section: Feature Engineering (mentioning)
confidence: 99%
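
The excerpt describes building dense route-level features for ETA prediction by aggregating per-link travel times. A minimal Python sketch of that aggregation, where the field names link_time and cross_time and the dict-per-segment layout are assumptions made here for illustration:

```python
import numpy as np

def dense_time_features(route):
    """Aggregate per-segment travel-time features for an ETA model.

    `route` is assumed to be a list of dicts with hypothetical keys
    'link_time' and 'cross_time' (seconds spent on a road link and at
    its crossing). Returns the sum and maximum of each, mirroring the
    dense features described in the excerpt.
    """
    link_times = np.array([seg["link_time"] for seg in route], dtype=float)
    cross_times = np.array([seg["cross_time"] for seg in route], dtype=float)
    return {
        "link_time_sum": link_times.sum(),
        "link_time_max": link_times.max(),
        "cross_time_sum": cross_times.sum(),
        "cross_time_max": cross_times.max(),
    }

# Example: a three-segment route
print(dense_time_features([
    {"link_time": 12.0, "cross_time": 3.0},
    {"link_time": 45.0, "cross_time": 8.5},
    {"link_time": 30.0, "cross_time": 0.0},
]))
```
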
“…This involves comprehending the given image through computer vision techniques and generating corresponding descriptions using natural language processing. Initially, researchers explored the encoder-decoder architecture (Yang et al. 2019a; Zhang et al. 2019; Yang et al. 2019c) with CNNs (Albawi, Mohammed, and Al-Zawi 2017) as image encoders and LSTM (Greff et al. 2017) as text decoders (Vinyals et al. 2015). To consider local and global features simultaneously, (Huang et al. 2019) used image regions to decode image segmentations sequentially to words, adding attention mechanisms to focus on specific image regions during decoding.…”
Section: Introduction (mentioning)
confidence: 99%
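
The excerpt outlines the standard encoder-decoder captioning pipeline: a CNN encodes the image and an LSTM decodes it into words. A minimal PyTorch sketch of that pattern, with a toy CNN and arbitrary layer sizes standing in for the pretrained backbones real systems use (all names and dimensions here are illustrative, not taken from the cited papers):

```python
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    """CNN encoder -> LSTM decoder, as in the encoder-decoder captioning
    setup the excerpt describes. Sizes are illustrative, not from the paper."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        # Tiny stand-in CNN encoder; real systems use a pretrained backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        # Encode the image to a single feature vector and prepend it to the
        # embedded caption tokens, then decode token-by-token with the LSTM.
        img_feat = self.encoder(images).unsqueeze(1)       # (B, 1, E)
        tok_emb = self.embed(captions)                     # (B, T, E)
        inputs = torch.cat([img_feat, tok_emb], dim=1)     # (B, T+1, E)
        hidden, _ = self.decoder(inputs)
        return self.out(hidden)                            # (B, T+1, V)

# Smoke test with random data
model = CaptionModel(vocab_size=1000)
logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, 1000, (2, 7)))
print(logits.shape)  # torch.Size([2, 8, 1000])
```
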
“…Later on, Feng and Zhou [24] show that random forests can do auto-encoder, implying that the informative rules of decision trees may accomplish representation learning. Deep forest is extended to numerous tasks and is successfully applied in metric learning [25], multi-label learning [26], semi-supervised learning [27], financial fraud detection [28, 29], etc. Deep forests, on the other hand, require a significant amount of memory and time due to the storing of multi-layer forest modules to do layer-by-layer prediction on the test set.…”
Section: Introduction (mentioning)
confidence: 99%
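
The layer-by-layer structure the excerpt refers to (and its memory cost, since every layer's forests must be retained for prediction) can be sketched with a simplified cascade forest built from scikit-learn random forests. This is an illustration of the cascade idea under assumed simplifications, not the actual gcForest or deep-forest implementation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

class TinyCascadeForest:
    """Simplified cascade (deep) forest: each layer's class-probability
    outputs are appended to the raw features for the next layer, and all
    layers must be kept in memory to predict layer by layer."""

    def __init__(self, n_layers=3, n_estimators=50, random_state=0):
        self.layers = [
            RandomForestClassifier(n_estimators=n_estimators,
                                   random_state=random_state + i)
            for i in range(n_layers)
        ]

    def fit(self, X, y):
        features = X
        for forest in self.layers:
            forest.fit(features, y)
            # Augment raw features with this layer's predicted probabilities.
            # (Real deep forests use cross-validated probabilities here.)
            features = np.hstack([X, forest.predict_proba(features)])
        return self

    def predict(self, X):
        features = X
        for forest in self.layers:
            proba = forest.predict_proba(features)
            features = np.hstack([X, proba])
        return self.layers[-1].classes_[proba.argmax(axis=1)]

# Smoke test on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
print(TinyCascadeForest().fit(X, y).predict(X[:5]))
```
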