2021
DOI: 10.48550/arxiv.2110.03007
Preprint
Unsupervised Multimodal Language Representations using Convolutional Autoencoders

Abstract: Multimodal Language Analysis is a demanding area of research, since it is associated with two requirements: combining different modalities and capturing temporal information. In recent years, several works have been proposed in the area, mostly centered around supervised learning in downstream tasks. In this paper we propose extracting unsupervised Multimodal Language representations that are universal and can be applied to different tasks. Towards this end, we map the word-level aligned multimodal seque…

Cited by 1 publication (2 citation statements)
References 15 publications
“…multimodal emotion recognition [63,101], and sentiment analysis [101]. Notice that the last two domains both use nonverbal signals to make decisions, and the effectiveness of unsupervised pre-training in [63,101,151] was on par with or even better than several fully-supervised SOTA methods. Unfortunately, unsupervised pre-training has not yet been integrated into the analysis of face-to-face co-located human-human social interaction.…”
Section: Artificial Intelligence
Confidence: 99%