Interspeech 2019
DOI: 10.21437/interspeech.2019-2482

Multi-Modal Sentiment Analysis Using Deep Canonical Correlation Analysis

Abstract: This paper learns multi-modal embeddings from the text, audio, and video views/modes of data in order to improve downstream sentiment classification. The experimental framework also allows investigation of the relative contributions of the individual views to the final multi-modal embedding. Individual features derived from the three views are combined into a multi-modal embedding using Deep Canonical Correlation Analysis (DCCA) in two ways: (i) One-Step DCCA and (ii) Two-Step DCCA. This paper learns text embedd…
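The abstract does not reproduce the authors' implementation, but the core of any DCCA-based fusion is a loss that maximizes the canonical correlation between two learned views. The sketch below shows that objective in PyTorch for a single pair of views; the toy dimensions, the regularization constant `eps`, and the use of the full singular-value sum (rather than a top-k variant) are illustrative assumptions, not details taken from the paper.

```python
import torch

def dcca_loss(h1, h2, eps=1e-4):
    """Negative total canonical correlation between two views.

    h1, h2: (n, d1) and (n, d2) batch outputs of two view-specific encoders.
    Minimizing this loss maximizes the linear correlation between the
    learned representations, which is the DCCA training signal.
    """
    n = h1.size(0)
    h1 = h1 - h1.mean(dim=0, keepdim=True)   # center each view
    h2 = h2 - h2.mean(dim=0, keepdim=True)

    # regularized covariance / cross-covariance estimates
    s11 = h1.t() @ h1 / (n - 1) + eps * torch.eye(h1.size(1), device=h1.device)
    s22 = h2.t() @ h2 / (n - 1) + eps * torch.eye(h2.size(1), device=h2.device)
    s12 = h1.t() @ h2 / (n - 1)

    def inv_sqrt(m):
        # symmetric inverse square root via eigendecomposition
        vals, vecs = torch.linalg.eigh(m)
        return vecs @ torch.diag(vals.clamp_min(eps).rsqrt()) @ vecs.t()

    t = inv_sqrt(s11) @ s12 @ inv_sqrt(s22)
    # total canonical correlation = sum of singular values of T
    return -torch.linalg.svdvals(t).sum()


# toy usage: correlate a 64-sample batch of audio and text embeddings
audio = torch.randn(64, 40)
text = torch.randn(64, 300)
print(dcca_loss(torch.nn.Linear(40, 20)(audio), torch.nn.Linear(300, 20)(text)))
```

Which two views the loss is applied to (for example, text versus a combined audio-video representation, or audio versus video in a first stage) is what distinguishes fusion schemes such as the One-Step and Two-Step variants named in the abstract.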

Cited by 21 publications (18 citation statements)
References 16 publications
“…We use CCA to learn a new common space for the audio and video modes, and combine the learned audio and video features with the original text embedding. This is because Sun et al. (2019) showed that using a CCA-based method to correlate audio and video is more effective than correlating audio-text or video-text. • Kernel-CCA: Kernel-CCA (Akaho 2006) introduces a nonlinearity via kernel maps.…”
Section: Baseline Methods
confidence: 99%
“…• DCCA: A Deep CCA-based algorithm proposed by Sun et al. (2019). Audio and video features are simply concatenated and then correlated with the text features using DCCA.…”
Section: Baseline Methods
confidence: 99%
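The wiring of that variant can be sketched the same way, with scikit-learn's linear CCA standing in for the deep networks of DCCA; this shows only the structure (concatenate audio and video, then correlate the result with text), not the cited model, and the dimensions are again placeholders.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(1)
n = 200
audio = rng.normal(size=(n, 74))
video = rng.normal(size=(n, 35))
text = rng.normal(size=(n, 300))

# concatenate audio and video into a single non-verbal view ...
audio_video = np.concatenate([audio, video], axis=1)

# ... and correlate it with the text view (linear CCA here; DCCA would
# replace the linear projections with neural-network transformations)
cca = CCA(n_components=20)
av_c, text_c = cca.fit_transform(audio_video, text)
fused = np.concatenate([av_c, text_c], axis=1)
print(fused.shape)                        # (200, 40)
```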
“…Following [13][14], we use Canonical Correlation Analysis (CCA) for feature alignment, such that L_al = -CCA. CCA for deep neural networks, also known as Deep CCA or DCCA, is a method to learn complex nonlinear transformations of data from two different modalities, such that the resulting representations are highly linearly correlated [15].…”
Section: Proposed Methods
confidence: 99%
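Spelled out in the usual DCCA formulation (a standard statement of the objective, not a formula quoted from this paper), the alignment loss is the negative of the correlation achieved by the two learned transformations:

```latex
% f_1, f_2: view-specific networks with parameters theta_1, theta_2.
% Training maximizes the canonical correlation of the transformed views,
% so the alignment loss is its negative.
\[
  (\theta_1^{*}, \theta_2^{*})
    = \arg\max_{\theta_1,\, \theta_2}
      \operatorname{corr}\bigl(f_1(X_1; \theta_1),\, f_2(X_2; \theta_2)\bigr),
  \qquad
  \mathcal{L}_{\mathrm{al}}
    = -\operatorname{corr}\bigl(f_1(X_1; \theta_1),\, f_2(X_2; \theta_2)\bigr).
\]
```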
“…[5] 76.5 / 73.4; Zadeh et al. [7] 76.9 / 77.0; Georgiou et al. [9] 76.9 / 76.9; Poria et al. [2] 77.64 / -; Ghosal et al. [10] 82.31 / 80.69; Ghosal et al. [10] 79.80 / -; Sun et al. [4] 80… [7]. (¦) Results are obtained on the CMU-MOSEI dataset after excluding the utterances with a sentiment score of 0. We mention the results of the proposed model with this setup in the parentheses.…”
Section: CMU-MOSEI Approach
confidence: 99%