2021
DOI: 10.48550/arxiv.2111.08006
Preprint

Disparities in Dermatology AI: Assessments Using Diverse Clinical Images

Roxana Daneshjou,
Kailas Vodrahalli,
Weixin Liang
et al.

Abstract: More than 3 billion people lack access to care for skin disease. AI diagnostic tools may aid in early skin cancer detection; however, most models have not been assessed on images of diverse skin tones or uncommon diseases. To address this, we curated the Diverse Dermatology Images (DDI) dataset, the first publicly available, pathologically confirmed images featuring diverse skin tones. We show that state-of-the-art dermatology AI models perform substantially worse on DDI, with ROC-AUC dropping 29-40 percent comp…

Cited by 4 publications (4 citation statements)
References 9 publications
“…The task is detecting whether a skin lesion is benign or malignant. We use the Inception (Szegedy et al, 2015) model trained on this dataset, which is available from (Daneshjou et al, 2021). Following the setting in (Lucieri et al, 2020), we collect concepts from the Derm7pt (Kawahara et al, 2018) dataset.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…AI solutions for dermatological tasks require careful data design and accurate data annotation to prevent performance issues [50]. Liang et al note that "a systematic assessment of three computer AI models for diagnosing malignant skin lesions demonstrated that the models all performed substantially worse on lesions appearing on dark skin compared with light skin" [51,52]. These performance disparities did not have a single cause.…”
Section: Domain Knowledge
Citation type: mentioning
confidence: 99%
“…ID setting The default paradigm in Machine Learning, both in supervised and unsupervised learning. Although this is the default paradigm, the usual assumption that train and test data come from the same distribution is very strong and almost never true for real-world datasets [9,44,12,27,18].…”
Section: Out-of-distribution Regimes
Citation type: mentioning
confidence: 99%