2021
DOI: 10.48550/arxiv.2109.09818
Preprint

Skin Deep Unlearning: Artefact and Instrument Debiasing in the Context of Melanoma Classification

Abstract: Convolutional Neural Networks have demonstrated dermatologist-level performance in the classification of melanoma and other skin lesions, but prediction irregularities due to biases seen within the training data are an issue that should be addressed before widespread deployment is possible. In this work, we robustly remove bias and spurious variation from an automated melanoma classification pipeline using two leading bias 'unlearning' techniques. We show that the biases introduced by surgical markings and rul…

Cited by 3 publications (2 citation statements)
References 30 publications (127 reference statements)
“…The situation is even more complex in medical image analysis for specialities such as radiology (National Lung Screening Trial, MIMIC-CXR-JPG [42], CheXpert [43]) or dermatology (Melanoma detection for skin cancer, HAM10000 database [44]), where biased datasets are provided for medical applications. Indeed, under-represented populations in some datasets lead to a critical drop in accuracy, for instance in skin cancer detection, as in [45,46], or for general research in medicine [47] and references therein.…”
Section: Improperly Sampled Training Data
Citation type: mentioning
confidence: 99%
“…achieved the best classification results with an AUC of 0.911 and balanced multiclass accuracy of 0.831 on three skin cancer classification tasks of ISIC-2017 by using an ensemble of ResNet-50 networks on normalized images (72). used ensemble learning with a stacking scheme and obtained the classification results with an accuracy of 0.885 and an AUC of 0.983 in the ISIC-2018 competition (73). employed two bias removal techniques, “Learning Not to Learn” (LNTL) and “Turning a Blind Eye” (TABE), to alleviate irregularities in model predictions and spurious changes in melanoma images.…”
Section: Dermatological Images and Datasets
Citation type: mentioning
confidence: 99%