Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency
DOI: 10.1145/3351095.3372837

Reducing sentiment polarity for demographic attributes in word embeddings using adversarial learning

Cited by 26 publications (18 citation statements); References 16 publications

“…We train the encoder 𝑓 and discriminator 𝑔 adversarially, in the hope that the embedding learned by the encoder 𝑓 can fool the discriminator 𝑔. Training such a network is challenging, and we rely on adversarial learning because it has shown promising results on other fairness tasks, such as removing unfairness in NLP applications [17,40]. We use adversarial learning to de-correlate the protected status variable 𝑍 from the feature vectors encoded via 𝑓(𝜃).…”
Section: Methods, 4.1 Learning Overview (mentioning)
confidence: 99%
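A minimal PyTorch sketch of the adversarial setup this excerpt describes, assuming a simple feed-forward encoder; the layer sizes, optimizers, and the weight lam are illustrative choices, not the cited paper's exact architecture:

import torch
import torch.nn as nn

# Encoder f maps inputs to embeddings; Discriminator g tries to recover
# the protected status variable Z from those embeddings.
class Encoder(nn.Module):
    def __init__(self, d_in=300, d_emb=128):        # dimensions are assumed
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, d_emb), nn.ReLU())

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    def __init__(self, d_emb=128, n_groups=2):      # n_groups: values Z can take
        super().__init__()
        self.net = nn.Linear(d_emb, n_groups)

    def forward(self, h):
        return self.net(h)

f, g = Encoder(), Discriminator()
opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)
xent = nn.CrossEntropyLoss()
lam = 1.0  # weight of the adversarial term (assumed hyperparameter)

def adversarial_step(x, z, task_loss_fn):
    # 1) Update g so it predicts Z from the (detached) embeddings.
    loss_g = xent(g(f(x).detach()), z)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # 2) Update f on its main task while fooling g: subtracting the
    #    discriminator loss pushes f toward embeddings from which Z
    #    cannot be predicted, de-correlating Z from the features.
    h = f(x)
    loss_f = task_loss_fn(h) - lam * xent(g(h), z)
    opt_f.zero_grad(); loss_f.backward(); opt_f.step()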
“…[17] shows that demographic information leaks into the intermediate representations of neural networks trained on text datasets and applies adversarial learning to mitigate those leaks. [40] takes advantage of adversarial networks to reduce word-vector sentiment bias for demographic identity terms.…”
Section: Related Work (mentioning)
confidence: 99%
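To make "word-vector sentiment bias for demographic identity terms" concrete, the following small probe builds a sentiment direction from seed words and projects identity-term vectors onto it; the emb lookup, the seed lists, and the example terms are hypothetical, not taken from [40]:

import numpy as np

# `emb` stands for any word -> vector lookup (e.g. a dict of GloVe
# vectors); the seed lists below are illustrative assumptions.
POS = ("good", "excellent", "happy")
NEG = ("bad", "terrible", "awful")

def sentiment_axis(emb):
    """Unit direction pointing from negative toward positive sentiment."""
    axis = (np.mean([emb[w] for w in POS], axis=0)
            - np.mean([emb[w] for w in NEG], axis=0))
    return axis / np.linalg.norm(axis)

def polarity(emb, word, axis):
    """Cosine projection of a word vector onto the sentiment axis."""
    v = emb[word]
    return float(v @ axis) / float(np.linalg.norm(v))

# After debiasing, identity terms should project near zero, and the gap
# between terms for different groups should shrink, e.g.:
#   axis = sentiment_axis(emb)
#   print(polarity(emb, "american", axis), polarity(emb, "mexican", axis))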
“…For instance, a model can memorize protected attributes and allow their disclosure later on (Kumar et al. 2019). Another model threat concerns how language models reproduce human discriminatory biases from their training text corpora (Sweeney and Najafian 2020). Moreover, the computation scenario, whether centralized cloud servers or distributed processing architectures, plays an important role in which privacy threats arise, since many of them are related to the misbehavior of components.…”
Section: Privacy Issues for NLP (mentioning)
confidence: 99%
“…Model properties, such as the vocabulary and neural network layers, can be used by an adversary to perform attacks that end up disclosing the private data used for training (Alawad et al. 2020). Additional privacy-related issues with NLP models concern biases (Sweeney and Najafian 2020; Gencoglu 2020; Tan and Celis 2019) and the unfair decisions made by biased models (Sweeney and Najafian 2020; Xu et al. 2019). Therefore, from the perspective of NLP models, the privacy threats are as follows.…”
Section: Threats from Models (mentioning)
confidence: 99%
“…Another approach to fairness in toxicity classification focuses on methods that mitigate bias in these systems. Several approaches have been proposed to debias word embeddings [29,33], and various methods directly target the outcomes of toxicity classification systems: [35] uses a multi-task learning model with an attention layer that jointly learns demographic information and toxicity in order to reduce bias, whereas [13] takes a causal approach that imposes a fairness penalty on models that predict different scores for comments differing only in their use of identity terms. To our knowledge, we are the first group to apply tools from Domain Generalization to a fair toxicity classification task.…”
Section: Fairness in Toxic Comment Detection (mentioning)
confidence: 99%
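The causal approach attributed to [13] can be sketched as an identity-swap penalty. The interface below, a model that scores token lists, the swap list, and a variance penalty, is an illustrative reconstruction under those assumptions, not the exact formulation in [13]:

import torch

# Illustrative swap list; the cited work uses a curated set of identity terms.
IDENTITY_TERMS = ["gay", "straight", "muslim", "christian"]

def counterfactual_penalty(model, tokens, term_index):
    """Penalize score gaps between copies of a comment that differ only
    in the identity term at `term_index`. `model` is assumed to map a
    token list to a scalar toxicity score (a hypothetical interface)."""
    scores = []
    for term in IDENTITY_TERMS:
        variant = list(tokens)
        variant[term_index] = term
        scores.append(model(variant))
    scores = torch.stack(scores)
    return scores.var()  # zero exactly when every variant gets the same score

# Training objective: total_loss = task_loss + mu * counterfactual_penalty(...),
# where mu (an assumed hyperparameter) trades accuracy against the penalty.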