2022
DOI: 10.1162/tacl_a_00512

Learning Fair Representations via Rate-Distortion Maximization

Abstract: Text representations learned by machine learning models often encode undesirable demographic information of the user. Predictive models based on these representations can rely on such information, resulting in biased decisions. We present a novel debiasing technique, Fairness-aware Rate Maximization (FaRM), that removes protected information by making representations of instances belonging to the same protected attribute class uncorrelated, using the rate-distortion function. FaRM is able to debias representat…
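For context, the rate-distortion function the abstract refers to is the standard coding-rate estimate: roughly, the number of bits needed to encode a set of representations up to a distortion ε. A minimal PyTorch sketch, where the function name and the default ε are our illustrative choices rather than the paper's:

```python
import torch

def coding_rate(Z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Coding-rate estimate R(Z, eps) = 1/2 * logdet(I + d/(n*eps^2) * Z^T Z):
    approximately the bits needed to encode the n rows of Z up to distortion eps."""
    n, d = Z.shape
    I = torch.eye(d, device=Z.device, dtype=Z.dtype)
    return 0.5 * torch.logdet(I + (d / (n * eps ** 2)) * Z.T @ Z)
```

Maximizing this quantity over the representations of a single protected-attribute class spreads those representations apart, i.e. decorrelates them, which is the effect the abstract describes.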

Cited by 3 publications (4 citation statements) · References 25 publications
“…In contrast, the separability measure based on the rate-distortion function [12] is more robust to changes in feature dimensionality, as it uses the singular values of the features to measure the distance between inter-class samples. Recent research has applied rate-distortion theory to explain neural network models, optimal feature learning methods, and so on [13].…”
Section: ICAITA-2023
Citation type: mentioning · Confidence: 99%
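The dimensionality-robustness noted above can be made concrete: the coding rate depends on the feature matrix only through its singular values, since log det(I + a·ZᵀZ) = Σᵢ log(1 + a·σᵢ²). A small sketch verifying the identity (the function name and ε are illustrative):

```python
import torch

def coding_rate_via_svd(Z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Coding rate from singular values: 1/2 * sum_i log(1 + a * s_i^2)."""
    n, d = Z.shape
    a = d / (n * eps ** 2)
    s = torch.linalg.svdvals(Z)  # singular values of the feature matrix
    return 0.5 * torch.log1p(a * s ** 2).sum()

Z = torch.randn(128, 64, dtype=torch.float64)
n, d = Z.shape
direct = 0.5 * torch.logdet(torch.eye(d, dtype=Z.dtype) + (d / (n * 0.5 ** 2)) * Z.T @ Z)
assert torch.allclose(coding_rate_via_svd(Z), direct)
```

For a fixed scaling a, feature dimensions with near-zero singular values contribute almost nothing to the sum, which is why the measure is insensitive to uninformative extra dimensions.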
“…Another line of work (Cheng et al. 2020; Dixon et al. 2018) uses counterfactual data augmentation approaches to debias sentence embeddings. Recently, Chowdhury and Chaturvedi (2022) proposed a debiasing framework that makes representations from the same protected attribute class uncorrelated by maximizing their rate-distortion function. Despite showing promise in a single domain, these frameworks fail to remain fair on out-of-distribution data (Barrett et al. 2019).…”
Section: Related Work
Citation type: mentioning · Confidence: 99%
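A minimal sketch of the kind of objective this statement describes: maximize the coding rate within each protected-attribute group (equivalently, descend on its negative), pushing same-group representations toward being uncorrelated. The loss name and the explicit group loop are our illustration, not the paper's exact training procedure:

```python
import torch

def per_group_rate_loss(Z: torch.Tensor, protected: torch.Tensor,
                        eps: float = 0.5) -> torch.Tensor:
    """Negative sum of per-group coding rates; minimizing this maximizes the
    rate-distortion function of each protected-attribute group."""
    loss = Z.new_zeros(())
    for g in protected.unique():
        Zg = Z[protected == g]  # representations sharing one protected label
        n, d = Zg.shape
        I = torch.eye(d, device=Z.device, dtype=Z.dtype)
        loss = loss - 0.5 * torch.logdet(I + (d / (n * eps ** 2)) * Zg.T @ Zg)
    return loss
```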
“…Machine learning systems can often rely on a user's demographic information, such as gender, race, and age (protected attributes), encoded in their representations (Elazar and Goldberg 2018) to make decisions, resulting in biased outcomes against certain demographic groups (Mehrabi et al. 2021; Shah, Schwartz, and Hovy 2020). Numerous works try to achieve fairness through unawareness (Apfelbaum et al. 2010) by debiasing model representations from protected attributes (Blodgett, Green, and O'Connor 2016; Elazar and Goldberg 2018; Elazar et al. 2021; Chowdhury and Chaturvedi 2022). However, these techniques only remove in-domain spurious correlations and fail to generalize to new data distributions (Barrett et al. 2019).…”
Section: Introduction
Citation type: mentioning · Confidence: 99%
“…We subsequently perform backdoor adjustment based on the average treatment effect, utilizing feature reweighting. In TE-D, we leverage the rate-distortion function, which controls the number of bits required to encode a set of vector representations (Chowdhury and Chaturvedi, 2022). We minimize the rate-distortion function for a non-linear projection of the features extracted from a biased pretrained model, while simultaneously minimizing the cross-entropy loss of predicting from these projected features.…”
Section: Introduction
Citation type: mentioning · Confidence: 99%
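A hedged sketch of the combined objective this statement describes for TE-D: minimize the coding rate of a non-linear projection of (frozen) biased pretrained features while keeping the projection predictive via cross-entropy. The projection head, classifier, and weight `lam` are illustrative assumptions, not the authors' exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def coding_rate(Z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Coding-rate estimate: bits needed to encode rows of Z up to distortion eps."""
    n, d = Z.shape
    I = torch.eye(d, device=Z.device, dtype=Z.dtype)
    return 0.5 * torch.logdet(I + (d / (n * eps ** 2)) * Z.T @ Z)

class DebiasHead(nn.Module):
    """Non-linear projection of biased pretrained features plus a task classifier."""
    def __init__(self, d_in: int, d_proj: int, n_classes: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(d_in, d_proj), nn.ReLU(), nn.Linear(d_proj, d_proj)
        )
        self.clf = nn.Linear(d_proj, n_classes)

    def loss(self, features: torch.Tensor, labels: torch.Tensor,
             lam: float = 0.1) -> torch.Tensor:
        Z = self.proj(features)                      # non-linear projection
        task = F.cross_entropy(self.clf(Z), labels)  # remain predictive of the task
        return task + lam * coding_rate(Z)           # ...while compressing the projection
```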