Proceedings of the 2018 Audio/Visual Emotion Challenge and Workshop (AVEC 2018)
DOI: 10.1145/3266302.3266314
Speech-based Continuous Emotion Prediction by Learning Perception Responses related to Salient Events

Cited by 8 publications (9 citation statements) · References 14 publications
“…The presented results demonstrate that the proposed CNN architecture outperforms the LSTM architecture in all tasks, except for arousal on the Development and the Hungarian Test sets, where matching results are achieved. While it could be argued that the LSTM model's hyperparameters allow for more tuning, the multi-modal AVEC baselines and all tuned audio-only systems are also outperformed on the German Test set for arousal [22,25,27]. For valence, our system is superior to three other models in the intra-cultural evaluation.…”
Section: Discussion
confidence: 86%
“…Time-dependent modelling was widely proposed in the contributions to AVEC. In the approach by Wataraka Gamage et al., emotional dimensions are modelled as the outputs of time-invariant filter arrays, each filter representing a 'salient event' [22]. Huang et al. employ a fusion of different (hand-crafted and deep) feature sets with an LSTM-RNN [14] and investigate data augmentation by cutting and overlapping the long sequences.…”
Section: Corpus
confidence: 99%
“…Interestingly, entries to the AVEC 2018 CES did not employ techniques such as transfer learning [80,81] or domain adaptation [29,54], which are typically seen in cross-cultural testing. In [76], the authors proposed a model based on emotional salience detection to identify emotion markers invariant to sociocultural context. The other two entrants employed data-driven approaches based on long short-term memory recurrent neural networks (LSTM-RNNs) [27,82].…”
Section: Cross-cultural Emotion Recognition
confidence: 99%
“…Finally, we end our review by mentioning one more alternative class of models: event-based models such as point-process models [76], [77], [78], [79], which aim to model the time and intensity of events and their impact on a dependent variable. Although event-based approaches are not common within affective computing, one notable recent example is [80], which proposed an event-filter model to predict valence and arousal over time from speech events. In their model, a vocal event j, occurring at times described by ϕ_j(t), produces an emotional "response" h_j(t).…”
Section: Integrating Discriminative and Generative Approaches
confidence: 99%
“…The emotional signal Y(t) is then proportional to the sum of the convolutions h_j(t) ∗ ϕ_j(t) across all events j. They tested their event-filter model on the AVEC 2018 dataset, where it outperformed the audio-channel-only baselines of the AVEC 2017 and 2018 challenges [80]. While we do not go further into event-based models in this paper, we do think they offer an alternative approach to modelling time-series emotions, which should be explored more in future research.…”
Section: Integrating Discriminative and Generative Approaches
confidence: 99%
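The last two excerpts describe the event-filter model in enough detail to sketch: each event type j has an event-occurrence signal ϕ_j(t) and an impulse response ("filter") h_j(t), and the predicted emotional signal Y(t) is proportional to the sum of the convolutions h_j ∗ ϕ_j over all event types. The sketch below is only an illustration of that equation with NumPy; the event types, filter shapes, and parameters are hypothetical placeholders, not those learned in the cited paper.

```python
import numpy as np

def exponential_filter(length, tau):
    """Hypothetical decaying impulse response h_j(t) for one event type."""
    t = np.arange(length)
    return np.exp(-t / tau)

def event_filter_prediction(event_signals, filters):
    """Y(t) ∝ sum_j (h_j ∗ ϕ_j)(t), truncated to the input length."""
    T = len(event_signals[0])
    y = np.zeros(T)
    for phi, h in zip(event_signals, filters):
        # Causal convolution of the event train with its filter,
        # keeping only the first T samples.
        y += np.convolve(phi, h)[:T]
    return y

# Two toy event trains over 100 frames (e.g. laughter onsets at frames
# 10 and 40, an emphasised word at frame 60) -- illustrative only.
T = 100
phi_1 = np.zeros(T); phi_1[[10, 40]] = 1.0
phi_2 = np.zeros(T); phi_2[60] = 1.0

h_1 = exponential_filter(30, tau=5.0)   # fast-decaying emotional response
h_2 = exponential_filter(30, tau=15.0)  # slower-decaying emotional response

y = event_filter_prediction([phi_1, phi_2], [h_1, h_2])
```

Because each filter is causal and decaying, an event produces a jump in `y` at its onset frame followed by an exponential decay, which is the qualitative behaviour the quoted description attributes to the model.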