2021
DOI: 10.48550/arxiv.2101.01673
Preprint

Characterizing Intersectional Group Fairness with Worst-Case Comparisons

Abstract: Machine Learning or Artificial Intelligence algorithms have gained considerable scrutiny in recent times owing to their propensity towards imitating and amplifying existing prejudices in society. This has led to a niche but growing body of work that identifies and attempts to fix these biases. A first step towards making these algorithms more fair is designing metrics that measure unfairness. Most existing work in this field deals with either a binary view of fairness (protected vs. unprotected groups) or politi…
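The abstract is cut off before the metric itself, so the following is only a minimal sketch of the general idea suggested by the title: reduce a per-subgroup statistic to the ratio between the worst-off and best-off intersectional subgroups. The function name, the choice of positive-prediction rate as the statistic, and the sample numbers are illustrative assumptions, not the paper's definitions.

```python
# Illustrative sketch only (not necessarily the paper's exact formulation):
# a "worst-case comparison" reduces per-subgroup values of some base
# fairness statistic to the ratio between the worst- and best-off subgroups.

def worst_case_ratio(subgroup_rates):
    """subgroup_rates: dict mapping an intersectional subgroup (e.g. a
    (race, gender) tuple) to a statistic such as its positive-prediction rate."""
    rates = list(subgroup_rates.values())
    best = max(rates)
    return min(rates) / best if best > 0 else float("nan")

rates = {
    ("white", "male"): 0.60,
    ("white", "female"): 0.55,
    ("black", "male"): 0.50,
    ("black", "female"): 0.30,
}
print(worst_case_ratio(rates))  # 0.5: the worst-off subgroup gets half the best rate
```

A value of 1.0 would indicate that no intersectional subgroup is treated worse than any other under the chosen statistic.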

Cited by 3 publications (5 citation statements)
References 0 publications
“…Binary metrics cannot be used if intersectional fairness is desired, e.g., fairness between White males and White females. Newer metrics like those proposed by Geyik et al. [29] that compare entire population distributions over an unspecified number of subgroups, or attention-based metrics [9,71,73] that also deal with the population distributions, are agnostic to group cardinality, and thus lend themselves to intersectionally fair frameworks [26,30]. …”
Section: Fair Ranking
Citation type: mentioning (confidence: 99%)
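To make the "agnostic to group cardinality" point concrete, here is a rough sketch of a distribution-comparison ranking metric, loosely in the spirit of the divergence-style measures of Geyik et al. [29]: it compares the subgroup shares in the top-k of a ranking against a desired distribution and works for any number of subgroups. The function name, smoothing constant, and example data are assumptions for illustration, not the exact metric from [29].

```python
# Rough sketch of a distribution-comparison ranking metric that is agnostic
# to the number of subgroups (loosely in the spirit of Geyik et al. [29]).
import math
from collections import Counter

def topk_kl_divergence(ranking, k, desired, eps=1e-9):
    """KL divergence between the subgroup distribution of the top-k items
    and a desired distribution over the same subgroups.

    ranking: list of subgroup labels, ordered by rank (any label set works).
    desired: dict mapping subgroup label -> target proportion (sums to 1).
    """
    counts = Counter(ranking[:k])
    kl = 0.0
    for group, target in desired.items():
        p = counts.get(group, 0) / k  # observed share of this subgroup in the top-k
        kl += p * math.log((p + eps) / (target + eps))
    return kl

ranking = ["white_male", "white_male", "white_female", "black_female", "white_male"]
desired = {"white_male": 0.25, "white_female": 0.25,
           "black_male": 0.25, "black_female": 0.25}
print(topk_kl_divergence(ranking, k=5, desired=desired))  # 0 only if shares match the target
```

Because the comparison is between whole distributions, adding more subgroups only adds entries to `desired`; nothing in the metric assumes a binary protected/unprotected split.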
“…We focus on metrics that (1) assess group fairness [23], possibly balanced against secondary objectives, and (2) are capable of dealing with multiple subgroups (i.e., not just binary protected versus unprotected classes). For our analysis, we adopted the definition of a subgroup as a Cartesian product of ≥ 2 groups, as defined in Ghosh et al. [30]. A subgroup sg_{a_1…a_n} is defined as the set containing the intersection of all members who belong to groups g_{a_1} through g_{a_n}, where a_1, a_2, …, a_n are marginal protected attributes like race, gender, etc. …”
Section: Metrics for Ranking Evaluation
Citation type: mentioning (confidence: 99%)
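A minimal sketch of the subgroup definition quoted above, assuming each member is represented as a mapping of attribute to value; the helper name and data layout are mine, not from [30].

```python
# Sketch of the quoted subgroup definition: a subgroup is the intersection
# of the members of two or more marginal protected groups.
from itertools import product

def intersectional_subgroups(population, attrs):
    """Map each combination of attribute values (a Cartesian product of
    >= 2 marginal groups) to the set of member ids in that intersection.

    population: dict of member_id -> {attribute: value}
    attrs:      marginal protected attributes, e.g. ["race", "gender"]
    """
    assert len(attrs) >= 2, "a subgroup intersects at least two marginal groups"
    values = [sorted({m[a] for m in population.values()}) for a in attrs]
    subgroups = {}
    for combo in product(*values):
        members = {mid for mid, m in population.items()
                   if all(m[a] == v for a, v in zip(attrs, combo))}
        if members:  # drop empty intersections
            subgroups[combo] = members
    return subgroups

people = {
    1: {"race": "A", "gender": "F"},
    2: {"race": "A", "gender": "M"},
    3: {"race": "B", "gender": "F"},
}
print(intersectional_subgroups(people, ["race", "gender"]))
# {('A', 'F'): {1}, ('A', 'M'): {2}, ('B', 'F'): {3}}
```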
“…The use of algorithms to aid critical decision-making processes in government and industry has attracted commensurate scrutiny from academia, lawmakers and social justice workers in recent times [4,7,71], because ML systems trained on a snapshot of society have the unintended consequence of learning, propagating and amplifying historical social biases and power dynamics [5,56]. The current research landscape consists of both ML explanation methods and fairness metrics that try to uncover the problems of trained models [8,30,45,59,68], and fairness-aware ML algorithms, for instance for classification [31,34,37,47], regression [2,9], causal inference [43,49], word embeddings [13,14] and ranking [16,64,72]. …”
Section: Algorithmic Fairness
Citation type: mentioning (confidence: 99%)
“…Previous work [19,38,48] points out that it is impossible to satisfy both classification-parity and calibration metrics at the same time, except under very specific conditions, and therefore context becomes key when picking a metric [6,62]. The statistical limitations extend to group-membership limitations: most conventional metrics require groups and subgroups to be discrete variables and cannot work with continuous variables [30], and “confusion matrix based metrics” [52] additionally do not support continuous outputs (which is often the case in problems like regression and recommendation), so measurement is severely limited by ad-hoc thresholds that cause interpretation to differ wildly (Figure 1). Also, most conventional fairness metrics ignore stochasticity: a temporal analysis in [42] showed how fairness metrics changing over time, due to data drift, concept drift or otherwise, could actually harm sensitive groups, especially when redressal is based on a measurement from a fixed point in time. …”
Section: Shortcomings of Existing Fairness Metrics
Citation type: mentioning (confidence: 99%)
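The threshold problem described above can be illustrated with a short sketch: the same continuous scores produce very different demographic-parity readings depending on the ad-hoc cutoff. The scores and group labels below are made up for illustration.

```python
# Illustration of the ad-hoc threshold problem: a confusion-matrix-style
# measure (here, the demographic parity gap) over continuous scores
# changes sharply with the chosen cutoff.

def positive_rate(scores, threshold):
    return sum(s >= threshold for s in scores) / len(scores)

group_a = [0.62, 0.58, 0.55, 0.41, 0.38]   # hypothetical model scores
group_b = [0.72, 0.51, 0.49, 0.47, 0.35]

for t in (0.4, 0.5, 0.6):
    gap = abs(positive_rate(group_a, t) - positive_rate(group_b, t))
    print(f"threshold={t:.1f}  demographic parity gap={gap:.2f}")
# threshold=0.4 and 0.6 report a gap of 0.00, while 0.5 reports 0.20,
# so the conclusion about unfairness depends entirely on the cutoff.
```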