2016
DOI: 10.48550/arXiv.1610.07183
Preprint

How to be Fair and Diverse?

Abstract: Due to the recent cases of algorithmic bias in data-driven decision-making, machine learning methods are being put under the microscope in order to understand the root cause of these biases and how to correct them. Here, we consider a basic algorithmic task that is central in machine learning: subsampling from a large data set. Subsamples are used both as an end-goal in data summarization (where fairness could either be a legal, political or moral requirement) and to train algorithms (where biases in the sample…

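The paper's own subsampling algorithms are not reproduced on this page. As a minimal, hypothetical illustration of the task the abstract describes (drawing a subsample that represents sensitive groups evenly), consider the Python sketch below; the function name and the `group_of` callback are illustrative choices, not the paper's method.

```python
import random
from collections import defaultdict

def balanced_subsample(items, group_of, k, seed=0):
    """Toy illustration only: draw a size-k subsample with roughly
    equal representation of each sensitive group. This is NOT the
    paper's algorithm, just the flavor of the task it formalizes."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for item in items:
        by_group[group_of(item)].append(item)
    quota = k // len(by_group)  # equal per-group quota
    sample = []
    for members in by_group.values():
        sample.extend(rng.sample(members, min(quota, len(members))))
    # Fill any remaining slots uniformly from the unsampled items.
    remaining = [x for x in items if x not in sample]
    rng.shuffle(remaining)
    sample.extend(remaining[: k - len(sample)])
    return sample

# Usage: subsample 4 records, balancing on the "gender" field.
data = [{"id": i, "gender": g} for i, g in enumerate("MMMMFF")]
print(balanced_subsample(data, lambda r: r["gender"], k=4))
```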
Cited by 11 publications (15 citation statements)
References 8 publications

“…Finally, the analysis of submodular valuations ties in with existing works on diversity in various fields from biology to machine learning (see, e.g. Celis et al [2016], Jost [2006]). A popular measurement for how diverse a solution is is to apply one of several concave functions called diversity indices to the proportions of the different entities/attributes (with respect to which we wish to be diverse) in the solution, e.g.…”
Section: Discussion (mentioning; confidence: 88%)
“…As a result, the development of frameworks for understanding data subsampling has emerged as its own subfield tangent to fair machine learning. One such framework is the determinantal point process (k-DPP), which proposes quantifiable measures for combinatorial subgroup diversity (via Shannon entropy) and geometric feature diversity (via measuring the volume of the k-dimensional feature space) [93]. In tandem with resampling is the problem of optimal data (or resource) allocation in operationalizing dataset collection in statistical surveys, as well as game-theoretic frameworks for understanding the influence of individual data points via Shapley values, which may refine resampling techniques for developing fair and diverse training datasets [94][95][96][97].…”
Section: Preprocessing (mentioning; confidence: 99%)
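As a sketch of the "geometric feature diversity" this statement attributes to k-DPPs: under a linear kernel, the squared volume spanned by the selected feature vectors equals the determinant of the corresponding principal submatrix of the Gram matrix, which a k-DPP uses as an unnormalized selection probability. The helper below is an illustrative sketch assuming NumPy, not code from the cited work.

```python
import numpy as np

def geometric_diversity(X, subset):
    """Log-volume log det(L_S) of the selected rows of X under the
    linear kernel L = X X^T (any PSD kernel could be substituted).
    Larger values mean the selected feature vectors are more spread out."""
    L = X @ X.T                      # Gram (kernel) matrix
    L_S = L[np.ix_(subset, subset)]  # principal submatrix for the subset
    sign, logdet = np.linalg.slogdet(L_S)
    return logdet if sign > 0 else float("-inf")

# Near-duplicate rows shrink the volume, so a k-DPP tends to avoid them.
X = np.array([[1.0, 0.0], [0.0, 1.0], [0.99, 0.01]])
assert geometric_diversity(X, [0, 1]) > geometric_diversity(X, [0, 2])
```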
“…Fair machine learning algorithms need to adopt/create specific fairness definitions that fit the context [87,111,119,143,31]. Common methods in fair classification include blinding [32,93], causal methods [80,108], transformation [74,64,208], sampling and subgroup analysis [37,65], adversarial learning [75,211,175], reweighing [106,31], and regularization and constraint optimization [111,17].…”
Section: Fairness (mentioning; confidence: 99%)
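Of the preprocessing methods this statement lists, reweighing is compact enough to sketch: assign each instance the weight P(g)P(y)/P(g, y), so that the sensitive attribute and the label become independent under the weighted empirical distribution (in the spirit of the reweighing approach cited above as [106,31]). The function below is an illustrative sketch, not the cited implementation.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Instance weights w(g, y) = P(g) * P(y) / P(g, y).
    Under-represented (group, label) pairs receive weight > 1,
    over-represented pairs receive weight < 1."""
    n = len(labels)
    p_g = Counter(groups)
    p_y = Counter(labels)
    p_gy = Counter(zip(groups, labels))
    return [
        (p_g[g] / n) * (p_y[y] / n) / (p_gy[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

# Example: group "a" is over-represented among positive labels,
# so its positive instances are down-weighted (0.75 < 1 here).
groups = ["a", "a", "a", "b", "b", "b"]
labels = [1, 1, 0, 1, 0, 0]
print(reweighing_weights(groups, labels))
```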