Interspeech 2017
DOI: 10.21437/interspeech.2017-1421
Adversarial Auto-Encoders for Speech Based Emotion Recognition

Abstract: Recently, generative adversarial networks and adversarial autoencoders have gained considerable attention in the machine learning community due to their exceptional performance in tasks such as digit classification and face recognition. They map the autoencoder's bottleneck-layer output (termed code vectors) to different noise Probability Distribution Functions (PDFs), which can be further regularized to cluster based on class information. In addition, they allow the generation of synthetic samples by sampling th…
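To make the abstract's setup concrete, the following is a minimal sketch of the adversarial-autoencoder structure it describes: an encoder maps features to bottleneck code vectors, a decoder reconstructs them, and a discriminator is trained to tell codes from samples of an imposed prior PDF. All layer sizes, the single-linear-layer networks, and the Gaussian prior are assumptions for illustration, not the paper's configuration.

```python
# Illustrative AAE forward pass (untrained weights; sizes are assumed).
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, CODE_DIM, N = 1582, 2, 8   # feature/code sizes chosen for illustration

W_enc = rng.normal(0, 0.01, (FEAT_DIM, CODE_DIM))   # encoder: features -> codes
W_dec = rng.normal(0, 0.01, (CODE_DIM, FEAT_DIM))   # decoder: codes -> features
w_disc = rng.normal(0, 0.01, (CODE_DIM,))           # linear discriminator

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = rng.normal(size=(N, FEAT_DIM))        # batch of acoustic feature vectors
code = x @ W_enc                          # bottleneck "code vectors"
recon = code @ W_dec                      # reconstruction
prior = rng.normal(size=(N, CODE_DIM))    # samples from the imposed noise PDF

# Discriminator scores prior samples (target 1) vs. encoder codes (target 0);
# adversarial training would push the code distribution toward the prior.
p_prior = sigmoid(prior @ w_disc)
p_code = sigmoid(code @ w_disc)
disc_loss = -np.mean(np.log(p_prior + 1e-8)) - np.mean(np.log(1.0 - p_code + 1e-8))
recon_loss = np.mean((x - recon) ** 2)
print(code.shape, recon.shape)
```

In a full implementation the reconstruction loss and the adversarial loss would be minimized jointly, with the encoder also updated to fool the discriminator.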

Cited by 59 publications (46 citation statements)
References 24 publications
“…This allows for the intermediate representation to discover a more noise robust representation that can work well across domains. Further work has examined different variations of autoencoders for speech emotion recognition, including Variational Autoencoders (VAE) [28], [29], Adversarial Autoencoders (AAE) [29], [30], [31], and Adversarial Variational Bayes (AVB) [29].…”
Section: Domain Generalization (confidence: 99%)
“…The novelty of this paper includes: (1) the ADDoG model, which allows for better generalized representation convergence by "meeting in the middle"; (2) the MADDoG method, which extends ADDoG to allow for many dataset differences to be […]…”
[Fig. 1 residue — methods compared: CycleGAN [16], [17], [18]; DAA [4], [27]; VAE [28], [29]; AAE [29], [30], [31]; AVB [29]; ADDA [19]; DANN [20], [21]; ADDoG; MADDoG]
Section: Introduction (confidence: 99%)
“…For speech emotion recognition, researchers in [97] implemented an adversarial autoencoder model. In this work, high-dimensional feature vectors of real data are encoded into 2-D representations, and a discriminator is learnt to distinguish real 2-D vectors from generated 2-D vectors.…”
Section: A Data Augmentation (confidence: 99%)
“…The experiments indicate that the 2-D representations of real data can yield suitable margins between different emotion categories. Additionally, when adding the generated data to the original data for training, performance can be marginally increased [97].…”
Section: A Data Augmentation (confidence: 99%)
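The augmentation use described in the two statements above can be sketched as follows: sample 2-D codes from the imposed prior, decode them into synthetic feature vectors, and append them to the real training data. The decoder here is an untrained stand-in and all sizes are assumed; the cited work's actual decoder and feature set are not reproduced.

```python
# Hedged sketch of AAE-based data augmentation (stand-in decoder, assumed sizes).
import numpy as np

rng = np.random.default_rng(1)
CODE_DIM, FEAT_DIM = 2, 1582
W_dec = rng.normal(0, 0.01, (CODE_DIM, FEAT_DIM))  # stand-in for a trained decoder

real = rng.normal(size=(100, FEAT_DIM))    # real training features (placeholder)
z = rng.normal(size=(50, CODE_DIM))        # sample codes from the prior PDF
synthetic = z @ W_dec                      # decode into synthetic feature vectors
augmented = np.vstack([real, synthetic])   # add generated data to the original
print(augmented.shape)
```

With a trained decoder, the synthetic rows would carry emotion-class structure inherited from the regularized code space, which is what allows the marginal performance gain reported in [97].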
“…I pursue a data augmentation approach to aid sentiment analysis in this paper. Data augmentation through GANs has been used in other tasks such as enhancing emotion recognition through speech [20,21], human pose estimation [22] and medical image synthesis [23]. In my work, I use a variant of cGAN and apply various heuristics to achieve their convergence.…”
Section: Introduction (confidence: 99%)