2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016
DOI: 10.1109/icassp.2016.7472192

A multimodal mixture-of-experts model for dynamic emotion prediction in movies

Abstract: Please refer to the published version for the most recent bibliographic citation information. If a published version is known, the repository item page linked above will contain details on accessing it.

Cited by 24 publications (32 citation statements)
References 16 publications
“…This ambiguity might prevent the model parameters for the couples in the "gray-zone" from being adequately learned. A way to address this limitation would be to perform soft clustering, according to which each couple is assigned to a probability of belonging to a given cluster, implemented through mixture of experts methodologies [27,62]. In this way, the parameters of each cluster will be learned using all samples, each weighted with an importance proportional to the strength of its belonging to a given cluster, potentially yielding more robust representations.…”
Section: Discussion (mentioning)
confidence: 99%
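
The soft-clustering idea in the statement above, where every sample contributes to every cluster in proportion to its membership probability, can be illustrated with a minimal sketch. It assumes one-dimensional samples and a softmax over negative squared distances as the membership probability. These choices, and the names `soft_memberships` and `weighted_cluster_means`, are illustrative assumptions and are not taken from the citing or cited papers.

```python
import math


def soft_memberships(x, centers, temperature=1.0):
    """Probability that sample x belongs to each cluster.

    Assumption: membership is a softmax over negative squared distances.
    """
    scores = [-((x - c) ** 2) / temperature for c in centers]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cluster_means(samples, centers, temperature=1.0):
    """Re-estimate each cluster mean from ALL samples, weighted by membership."""
    k = len(centers)
    weighted_sum = [0.0] * k
    weight_total = [0.0] * k
    for x in samples:
        for j, w in enumerate(soft_memberships(x, centers, temperature)):
            weighted_sum[j] += w * x
            weight_total[j] += w
    return [weighted_sum[j] / weight_total[j] for j in range(k)]
```

A sample halfway between two centers receives equal membership in both, so it informs both clusters' parameters instead of being forced into one of them.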
“…When signals propagate through the network, a conditional layer learns which expert networks to activate, and so that the various combinations of experts are flexible under different circumstances. It was proven to outperform popular fusion strategies in dynamic emotion prediction using visual-audio [49], [50]. For developing DMoE model, we apply fully-connected layers for both expert and gating networks.…”
Section: Multimodal Fusion Modules (mentioning)
confidence: 99%
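
The statement above describes expert and gating networks built from fully-connected layers. A minimal sketch of such a forward pass follows, assuming a single fully-connected layer per expert, a softmax gate, and a soft (dense) weighting of all experts rather than a hard selection. The layer sizes, the random initialisation, and the names `dense`, `init_layer` and `moe_forward` are hypothetical and do not reproduce the model of any cited paper.

```python
import math
import random


def dense(x, weights, bias):
    """Fully-connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Hypothetical initialisation: small uniform weights, zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def moe_forward(x, experts, gate):
    """Mix expert outputs using input-dependent gate probabilities."""
    gate_probs = softmax(dense(x, *gate))  # one probability per expert
    expert_outs = [dense(x, w, b) for w, b in experts]
    n_out = len(expert_outs[0])
    y = [sum(g * out[i] for g, out in zip(gate_probs, expert_outs))
         for i in range(n_out)]
    return y, gate_probs


rng = random.Random(0)
experts = [init_layer(4, 2, rng) for _ in range(3)]  # 3 experts, 4 inputs, 2 outputs
gate = init_layer(4, 3, rng)                         # gate emits one score per expert
y, gate_probs = moe_forward([0.1, 0.2, 0.3, 0.4], experts, gate)
```

Because the gate depends on the input, different inputs can weight the experts differently, which is the flexibility the statement refers to. A sparse variant would keep only the highest gate probabilities instead of mixing all experts.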
“…Over the past three years or so, higher level state prediction has also been investigated with various types of data collected from music [164][165][166], video [167,168] and EEG signals [169,170]. Despite the increasing interest in new modalities such as the EEG, most work in this area has concerned predictive medical applications [128,171], although a study on improving driving safety by predicting emotion from EGG signals can be found [51].…”
Section: Prediction Of Higher Level Individual Characteristics (mentioning)
confidence: 99%