2020
DOI: 10.1609/aaai.v34i05.6307

Privacy Enhanced Multimodal Neural Representations for Emotion Recognition

Abstract: Many mobile applications and virtual conversational agents now aim to recognize and adapt to emotions. To enable this, data are transmitted from users' devices and stored on central servers. Yet these data contain sensitive information that could be used by mobile applications without the user's consent or, maliciously, by an eavesdropping adversary. In this work, we show how multimodal representations trained for a primary task, here emotion recognition, can unintentionally leak demographic information, which co…

Citations: cited by 38 publications (9 citation statements)
References: 20 publications
“…AUC scores for ethnicity are important only for video and, to a lesser extent, multimodal models, meaning that some information is retrievable. Finally, contrary to [23], our multimodal model does not show a higher leakage than our unimodal models.…”
Section: B Experiments Results (contrasting)
confidence: 80%
“…Interestingly, these methodologies are also linked to privacy methods where network designers try to protect their system from attackers trying to retrieve personal information from latent representations. Thus, adversarial learning has been used in the context of speech processing [22], deep visual recognition [21], or multimodal (prosodic and verbal content) emotion recognition [23]. However these approaches always rely on the explicit usage of the protected variables.…”
Section: Fairness Via Adversarial Methods (mentioning)
confidence: 99%
“…In the future, we aim to build our SER model using more complex model structures, e.g., RNN+classifer. We also wish to apply the defense mechanism, such as adversarial training shown in [41], to train the SER model in the FL set up. Finally, we wish to evaluate the membership inference attack within similar experimental settings.…”
Section: Discussion (mentioning)
confidence: 99%
“…In [61] Nautsch et al investigate the importance of the development of privacy-preserving technologies to protect speech signals and highlight the importance of applying these technologies to protect speakers and speech characterization in recordings. Some recent works have sought to protect speaker identity [67], gender identity [33] and emotion [2]. VoiceMask, for example, was proposed to mitigate the security and privacy risks of voice input on mobile devices by concealing voiceprints [67].…”
Section: Related Work (mentioning)
confidence: 99%