Proceedings of the 1st International on Multimodal Sentiment Analysis in Real-Life Media Challenge and Workshop 2020
DOI: 10.1145/3423327.3423670
Unsupervised Representation Learning with Attention and Sequence to Sequence Autoencoders to Predict Sleepiness From Speech

Abstract: Motivated by the attention mechanism of the human visual system and recent developments in the field of machine translation, we introduce our attention-based and recurrent sequence to sequence autoencoders for fully unsupervised representation learning from audio files. In particular, we test the efficacy of our novel approach on the task of speech-based sleepiness recognition. We evaluate the learnt representations from both autoencoders, and conduct an early fusion to ascertain possible complementarity betwe…
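The abstract describes recurrent sequence-to-sequence autoencoders with attention, trained to reconstruct audio so that the encoder's final state can serve as an unsupervised utterance representation. A minimal PyTorch sketch of that idea follows; the GRU layers, multi-head attention, layer sizes, and spectrogram input shape are illustrative assumptions, not the authors' exact architecture:

```python
# Hedged sketch: a recurrent seq2seq autoencoder with attention for
# unsupervised audio representation learning. Inputs are assumed to be
# log-Mel spectrograms of shape (batch, time, n_mels).
import torch
import torch.nn as nn

class Seq2SeqAutoencoder(nn.Module):
    def __init__(self, n_mels=128, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(n_mels, hidden, batch_first=True)
        self.decoder = nn.GRU(n_mels, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.out = nn.Linear(hidden * 2, n_mels)

    def forward(self, x):
        enc_out, h = self.encoder(x)               # encode the whole sequence
        # Teacher forcing: shift the input right by one frame for the decoder.
        dec_in = torch.cat([torch.zeros_like(x[:, :1]), x[:, :-1]], dim=1)
        dec_out, _ = self.decoder(dec_in, h)
        # Each decoder state attends over all encoder states.
        ctx, _ = self.attn(dec_out, enc_out, enc_out)
        recon = self.out(torch.cat([dec_out, ctx], dim=-1))
        return recon, h.squeeze(0)                 # h: the learnt representation

model = Seq2SeqAutoencoder()
spec = torch.randn(4, 300, 128)                    # 4 clips, 300 frames each
recon, representation = model(spec)
loss = nn.functional.mse_loss(recon, spec)         # reconstruction objective
```

The representation returned here could then be fed to any downstream regressor or classifier, which is the general pattern the paper's sleepiness experiments follow.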

Cited by 9 publications (9 citation statements). References: 35 publications.
“…Autoencoders are unsupervised learning models that are used for sequence-to-sequence modeling [76]. The encoder processes the input and generates a corresponding code.…”
Section: Auto-encoders
confidence: 99%
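The statement above reduces to the basic encoder/decoder split: the encoder compresses the input into a code, and the decoder reconstructs the input from that code, with no labels involved. A minimal sketch, assuming PyTorch and purely illustrative sizes:

```python
# Hedged sketch of the encoder -> code -> decoder pattern the citing
# text describes; all dimensions are assumptions for illustration.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())  # input -> code
decoder = nn.Linear(64, 784)                             # code -> reconstruction

x = torch.randn(32, 784)
code = encoder(x)                          # the "code" the citing text refers to
recon = decoder(code)
loss = nn.functional.mse_loss(recon, x)    # trained without any labels
```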
“…Using the conditional probabilities of phrase pairings generated by the RNN Encoder-Decoder as an additional feature in the existing log-linear model, the performance of the statistical machine translation system is empirically shown to improve. Shahin Amiriparian et al. [3] propose a recurrent sequence-to-sequence autoencoder for unsupervised representation learning.…”
Section: Related Work
confidence: 99%
“…To learn a representation (encoding), we utilise an autoencoder, and to classify, we employ LSTM networks [2], [3], [6]-[8], [16].…”
Section: Introduction
confidence: 99%
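The pipeline this citing work describes, an autoencoder for the encoding and an LSTM for the prediction, can be sketched as below; the stand-in linear encoder, layer sizes, and binary output are assumptions, not the cited system:

```python
# Hedged sketch: pretrained encoder produces per-frame codes, an LSTM
# classifies the encoded sequence. Sizes are illustrative only.
import torch
import torch.nn as nn

encoder = nn.Linear(40, 16)              # stand-in for a trained AE encoder
lstm = nn.LSTM(16, 32, batch_first=True)
head = nn.Linear(32, 2)                  # binary prediction head

frames = torch.randn(8, 100, 40)         # 8 sequences of 100 feature frames
codes = encoder(frames)                  # per-frame encodings
_, (h, _) = lstm(codes)                  # final hidden state summarises the clip
logits = head(h[-1])                     # one class score vector per sequence
```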
“…They achieved respective correlations of ρ = 0.369 and ρ = 0.343, while the winner of the challenge, who used Fisher vectors and bag-of-features (24), achieved a correlation of ρ = 0.387. Since then, two other systems using deep learning have been proposed (25, 26), both achieving performances below that of the challenge winner (ρ = 0.317 and ρ = 0.367, respectively).…”
Section: Introduction
confidence: 99%
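The ρ values quoted here are Spearman rank correlations between predicted and reference sleepiness ratings, the challenge's scoring metric. A minimal check with SciPy, using made-up numbers purely for illustration:

```python
# Hedged sketch: computing Spearman's rho as used to score the
# sleepiness task. The ratings below are invented for illustration.
from scipy.stats import spearmanr

gold = [3, 7, 5, 2, 9, 4]   # reference sleepiness ratings (e.g. KSS-style, 1-9)
pred = [4, 6, 5, 3, 8, 5]   # hypothetical system outputs
rho, p = spearmanr(gold, pred)
print(f"rho = {rho:.3f} (p = {p:.3f})")
```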