2019
DOI: 10.48550/arxiv.1908.09635
Preprint

A Survey on Bias and Fairness in Machine Learning

Abstract: With the widespread use of AI systems and applications in our everyday lives, it is important to take fairness issues into consideration while designing and engineering these types of systems. Such systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that the decisions do not reflect discriminatory behavior toward certain groups or populations. We have recently seen work in machine learning, natural language processing, and deep learning…

Cited by 229 publications (341 citation statements). References 63 publications.
“…Several formulations of fairness have been proposed in the ML community [18], [5]. The first and most basic formulation is to ignore sensitive attributes with attribute unaware fairness: if a model cannot use a sensitive attribute in its prediction, the model is fair [5].…”
Section: Related Work
confidence: 99%
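
The attribute-unaware ("fairness through unawareness") formulation quoted above amounts to withholding the sensitive column from the model. A minimal sketch, assuming a pandas DataFrame with hypothetical `label` and `sensitive` column names and numeric features:

```python
# Sketch of attribute-unaware fairness ("fairness through unawareness"):
# the sensitive attribute is dropped before training, so the model cannot
# use it directly in its predictions. Column names here are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def fit_unaware_model(df: pd.DataFrame, label: str, sensitive: str) -> LogisticRegression:
    X = df.drop(columns=[label, sensitive])  # model never sees the sensitive column
    y = df[label]
    return LogisticRegression(max_iter=1000).fit(X, y)
```

Features correlated with the sensitive attribute can still act as proxies and leak it, which is one reason stronger distributional criteria such as demographic parity (next excerpt) are used.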
“…We select demographic parity [5] as our fairness criterion. Informally, demographic parity is satisfied if the output of the model is not dependent on a given sensitive attribute [18]. Formally, we define the demographic parity fairness criterion on the link prediction problem as follows.…”
Section: Definition 1 (Graph Representation Learning [?])
confidence: 99%
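
The informal description above corresponds to the standard demographic parity condition, namely that the positive-prediction rate P(Ŷ = 1 | A = a) is the same for every group a. The citing paper's exact link-prediction definition is not reproduced here, so the sketch below only checks the generic criterion on a set of predictions; all names are illustrative:

```python
# Generic demographic parity check: the positive-prediction rate should be
# (approximately) equal for every value of the sensitive attribute, i.e.
# P(Y_hat = 1 | A = a) is constant in a. Returns the largest gap between groups.
import numpy as np

def demographic_parity_difference(y_pred: np.ndarray, sensitive: np.ndarray) -> float:
    rates = [float(y_pred[sensitive == g].mean()) for g in np.unique(sensitive)]
    return max(rates) - min(rates)  # 0.0 means exact demographic parity
```

For link prediction, `y_pred` would be the predicted links and `sensitive` the group assigned to each candidate edge (e.g., derived from node attributes).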
“…Manifestations of different kinds of biases have been shown to exist in various components used to develop NLP and ML systems, from training data to pre-trained models to algorithms and resources [9,12,30,31,39]. Although several papers discussed various methodologies to de-bias word embedding models, these techniques have been scrutinized on several occasions [4,18].…”
Section: Word Embedding Bias
confidence: 99%
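
One widely used probe for the embedding bias this excerpt refers to (and the quantity targeted by the debiasing methods it mentions) is projection onto a gender direction. A minimal sketch, assuming `vectors` is any word-to-numpy-array mapping such as loaded word2vec or GloVe vectors; the anchor words are an illustrative choice:

```python
# Minimal bias probe for word embeddings: project a word onto the
# "he" - "she" direction and report which pole it leans toward.
# `vectors` is assumed to map word -> numpy array (e.g., word2vec/GloVe).
import numpy as np

def gender_direction_score(vectors, word: str) -> float:
    direction = vectors["he"] - vectors["she"]
    direction /= np.linalg.norm(direction)
    v = vectors[word] / np.linalg.norm(vectors[word])
    return float(v @ direction)  # > 0 leans "he", < 0 leans "she"
```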
“…This means that if the ML algorithms had been trained on the modified training data, they would not have exhibited the unexpected or undesirable behavior or would have exhibited this behavior to a lesser degree. Explanations generated by our framework, which complement existing approaches in XAI, are crucial for helping system developers and ML practitioners to debug ML algorithms for data errors and bias in training data, such as measurement errors and misclassifications [35,42,94], data imbalance [27], missing data and selection bias [29,62,63], covariate shift [74,82], technical biases introduced during data preparation [85], and poisonous data points injected through adversarial attacks [36,43,65,83]. It is known in the algorithmic fairness literature that information about the source of bias is critically needed to build fair ML algorithms because no current bias mitigation solution fits all situations [27,31,36,82,94].…”
Section: Introduction
confidence: 99%
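
As a concrete illustration of the simplest items on that list (group imbalance and skewed labels), a hypothetical per-group audit of training data might look like the sketch below; it is not the cited explanation framework itself, and the column names are assumptions:

```python
# Hypothetical training-data audit for two issues named in the excerpt:
# representation imbalance (group sizes) and label skew (per-group positive rates).
import pandas as pd

def group_audit(df: pd.DataFrame, sensitive: str, label: str) -> pd.DataFrame:
    return df.groupby(sensitive)[label].agg(
        count="size",          # size of each group in the training data
        positive_rate="mean",  # base rate of the (binary) label per group
    )
```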
“…More compact and coherent descriptions are needed. Furthermore, sources of bias and discrimination in training data are typically not randomly distributed across different sub-populations; rather they manifest systematic errors in data collection, selection, feature engineering, and curation [29,35,42,62,63,70,94]. That is, more often than not, certain cohesive subsets of training data are responsible for bias.…”
Section: Introduction
confidence: 99%
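
To make the "cohesive subsets responsible for bias" idea concrete, a toy scan over single-feature slices is sketched below, ranking slices by how far their label rate deviates from the overall rate. This only illustrates the general idea and is not the approach of the quoted paper:

```python
# Illustrative slice scan: for each value of each candidate feature, compare the
# slice's positive-label rate to the overall rate and rank slices by deviation.
import pandas as pd

def rank_slices(df: pd.DataFrame, label: str, features: list[str]) -> pd.DataFrame:
    overall = df[label].mean()
    rows = []
    for col in features:
        for value, grp in df.groupby(col):
            rows.append({
                "slice": f"{col} == {value!r}",
                "size": len(grp),
                "deviation": grp[label].mean() - overall,
            })
    return pd.DataFrame(rows).sort_values("deviation", key=abs, ascending=False)
```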