2021
DOI: 10.1109/taslp.2021.3068598

Analyzing Multimodal Sentiment Via Acoustic- and Visual-LSTM With Channel-Aware Temporal Convolution Network

Cited by 66 publications (20 citation statements)
References 40 publications
“…We can infer from Fig. 2 that the language modality is the most informative modality and is rarely filtered out (a conclusion consistent with other works (Mai et al, 2021b)). In contrast, the acoustic modality is frequently identified as noisy and filtered out, making it the most uninformative modality.…”
Section: Analysis on the Modality Importance (supporting)
confidence: 90%
“…Moreover, even with satisfactory unimodal networks, it is not always the case that multimodal models reach higher performance than the unimodal ones (Mai et al, 2021b). The reason may be that a modality may not contain useful information in some utterances and may even carry noise, which hinders the learning of a correct multimodal embedding.…”
Section: Introduction (mentioning)
confidence: 99%
“…CMU-MOSEI includes labels not only for the emotion recognition task but also for the (text-based) sentiment analysis problem. For this reason, it is mostly used for sentiment analysis architectures (e.g., [63, 66, 67, 70, 83-85]). However, emotion recognition researchers should also adopt its benefits in order to produce more robust models and gain better insight into their actual performance.…”
Section: Datasets (mentioning)
confidence: 99%
“…However, this study, due to different pretraining, reports a smaller number of parameters for some models (e.g., [23] reports 1,549,321 parameters for MulT on the MOSEI dataset with a different pretraining procedure). Nevertheless, for the sake of a complete comparison, we choose to report the results of [21], though this may underestimate the number of parameters of the reported models.…”
Section: Model Complexity (mentioning)
confidence: 99%
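
As a side note on how per-model parameter counts such as the 1,549,321 figure quoted above are typically obtained, the following is a minimal sketch assuming PyTorch is used; the demo_model below is a hypothetical stand-in, not the MulT architecture from [23].

```python
import torch.nn as nn

def count_trainable_parameters(model: nn.Module) -> int:
    # Sum the element counts of all parameters that require gradients.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

if __name__ == "__main__":
    # Hypothetical small network, used only to demonstrate the counting utility
    # (not the actual MulT model from the quoted comparison).
    demo_model = nn.Sequential(
        nn.Linear(300, 128),
        nn.ReLU(),
        nn.Linear(128, 1),
    )
    print(count_trainable_parameters(demo_model))  # (300*128 + 128) + (128*1 + 1) = 38657
```

Counts produced this way depend on the chosen feature dimensions and pretrained components, which is why different pretraining procedures yield different parameter totals for the same architecture.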