2020
DOI: 10.1145/3392866

How We've Taught Algorithms to See Identity: Constructing Race and Gender in Image Databases for Facial Analysis

Abstract: Race and gender have long sociopolitical histories of classification in technical infrastructures, from the passport to social media. Facial analysis technologies are particularly pertinent to understanding how identity is operationalized in new technical systems. What facial analysis technologies can do is determined by the data available to train and evaluate them with. In this study, we specifically focus on this data by examining how race and gender are defined and annotated in image databases used for facial analysis. […]

Cited by 128 publications (75 citation statements)
References: 77 publications
“…Our approach and exposition are heavily inspired by [47], who used an ensemble of classifiers to study dataset bias and cross-dataset generalization in commonly used datasets for object detection and classification, noting substantial differences in the representation of nominally similar categories across datasets. We draw on the survey of computer vision datasets in [42], who enumerate the diverse and ill-defined array of racial classification schemes and definitions in computer vision datasets. Geiger et al. [16] find significant variability in the quality of human annotations in a case study of papers on tweet classification.…”
Section: Dataset Audits (mentioning)
confidence: 99%
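The ensemble-of-classifiers probe cited above is essentially a "name that dataset" test: if a classifier can reliably tell which dataset an image came from, the datasets differ systematically even for nominally similar categories. Below is a minimal sketch of that idea, assuming scikit-learn and pre-extracted per-image feature vectors; the dataset names, sizes, and synthetic features are illustrative stand-ins, not the cited study's actual data or method.

```python
# Sketch of a "name that dataset" probe for dataset bias: train a
# classifier to predict the source dataset of each feature vector.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-ins for features extracted from three face datasets;
# the small mean shifts simulate dataset-specific capture conditions.
datasets = {
    "dataset_a": rng.normal(0.0, 1.0, size=(500, 128)),
    "dataset_b": rng.normal(0.2, 1.0, size=(500, 128)),
    "dataset_c": rng.normal(-0.2, 1.0, size=(500, 128)),
}

# Label each feature vector with the dataset it came from.
X = np.vstack(list(datasets.values()))
y = np.concatenate([[name] * len(feats) for name, feats in datasets.items()])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# If the classifier separates datasets from features of nominally similar
# content, the datasets carry distinguishable biases.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"dataset-identification accuracy: {clf.score(X_test, y_test):.2f}")
# Chance level here is 1/3; accuracy well above that signals dataset bias.
```

The same setup extends to cross-dataset generalization: train a category classifier on one dataset and measure how much its accuracy drops on another dataset's images of the same nominal category.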
“…Racial categories are used without being defined, or are only loosely and nebulously defined [42]; researchers either take the existence and definitions of racial categories as a given that does not have to be justified, or adopt an "I know it when I see it" approach to racial categories. Given that the categories are allusions to both geographic origin and physical characteristics, it is understandable that deeper discussion of these categories is avoided, because it veers into unpleasant territory.…”
Section: Racial Categories, 3.1 Usage in Fair Computer Vision (mentioning)
confidence: 99%