2021
DOI: 10.48550/arxiv.2104.02821
Preprint

Towards Measuring Fairness in AI: the Casual Conversations Dataset

Abstract: This paper introduces a novel dataset to help researchers evaluate their computer vision and audio models for accuracy across a diverse set of ages, genders, apparent skin tones and ambient lighting conditions. Our dataset is composed of 3,011 subjects and contains over 45,000 videos, with an average of 15 videos per person. The videos were recorded in multiple U.S. states with a diverse set of adults in various age, gender and apparent skin tone groups. A key feature is that each subject agreed to participate …

Citations: cited by 4 publications (5 citation statements)
References: 7 publications
“…In 2021 Facebook AI released a human-annotated dataset consisting of 45,000 videos of 3011 identities for researchers to use to study fairness across a diverse set of ages, genders, and apparent skin tone [115].…”
Section: Facebook Data
Citation type: mentioning
confidence: 99%
“…We also note that Dooley et al [18] includes a fourth dataset, called the Casual Conversations Dataset [37], which we omit from our experiments. We have done this due to a potential interpretation of the data use license agreement which limits modifications and could preclude corrupting the images in their dataset.…”
Section: Datasets
Citation type: mentioning
confidence: 99%
“…We acknowledge that the common academic datasets which we used to evaluate our research questions (Adience [22], MIAP [72], and UTKFace [90]) are all datasets of images scraped from the web without the informed consent of those whom are depicted. This ethical challenge is one that has plagued the research and computer vision community for the last decade [62,63] and we are excited to see datasets being released which have fully informed consent of the subjects, such as the Casual Conversations Dataset [37]. Unfortunately, this dataset in particular has a rather restrictive license, much more restrictive than similar datasets, which prohibited its use in our study.…”
Section: Ethics Statement
Citation type: mentioning
confidence: 99%
“…Fairness has received much attention in the recent literature on computer vision and deep learning [5,6,16,21,29,36,39,48,49,51,52,55,57]. The most common goal in these works is to enhance fairness by reducing the accuracy disparity of a model between images from different demographic sub-groups.…”
Section: Fairness In Computer Vision
Citation type: mentioning
confidence: 99%
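
The "accuracy disparity" named in the last statement can be made concrete with a minimal sketch. Everything below is illustrative: the function names, the toy labels, and the group tags "a" and "b" are hypothetical and come from neither the cited papers nor the Casual Conversations Dataset. Disparity is taken here as the gap between the best and worst per-group accuracy, which is one common choice among several.

```python
from collections import defaultdict


def per_group_accuracy(y_true, y_pred, groups):
    """Return {group: accuracy} for parallel lists of labels, predictions, group tags."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_disparity(y_true, y_pred, groups):
    """Gap between the highest and lowest per-group accuracy (assumed definition)."""
    accuracies = per_group_accuracy(y_true, y_pred, groups)
    return max(accuracies.values()) - min(accuracies.values())


# Toy data (hypothetical): eight samples split evenly across two sub-groups.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

if __name__ == "__main__":
    print(per_group_accuracy(y_true, y_pred, groups))
    print(accuracy_disparity(y_true, y_pred, groups))
```

A disparity of zero would mean the model is equally accurate on every sub-group. Other summaries, such as the ratio of worst to best accuracy, are equally valid depending on the study.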