2021
DOI: 10.1016/j.jid.2020.07.034

Clinically Relevant Vulnerabilities of Deep Machine Learning Systems for Skin Cancer Diagnosis


Cited by 18 publications (15 citation statements)
References 9 publications

“…10 Factors such as positioning, rotation, colour balance, and even presence of surgical skin markings have all been shown to influence machine learning algorithm performance and any variability of these factors in datasets warrants further evaluation. [76][77][78] This has implications for the use of watermarked atlas images, which might require processing such as cropping prior to use. Furthermore, only 19% of open access datasets in our review stated whether image processing or adjustment by a reviewer occurred.…”
Section: Review (mentioning)
confidence: 99%
“…• It could reveal unknown failure modes of the artificial intelligence system, such as a tendency to produce higher error rates in certain populations, diseases, or settings, or in the presence of specific input data characteristics. 9,11,18 • Before deployment, it can be used to derive a measurable adverse event rate, which can inform how closely safety monitoring and post-deployment auditing should be performed. It can also provide a baseline measurement against which ongoing performance can be benchmarked.…”
Section: Scoping (mentioning)
confidence: 99%
“…Mismatch or incompatibility of input data used during deployment can arise from various types of dataset shift (including population shift, annotation shift, prevalence shift, manifestation shift, and acquisition shift). 11 Interactions with users and the deployment environment are subject to automation bias, human error, and unintended or intended misuse. 12 Additionally, the reasons for unexpectedly poor performance can be non-obvious even after human inspection, and subtle or even unnoticeable differences in the input data might lead to catastrophic failure.…”
Section: Introduction (mentioning)
confidence: 99%
“…Publicly available datasets have limitations, including lack of inclusivity. Moreover, standardized methods for clinical image acquisition have not been established in dermatology as they have been for radiology, and variations and inconsistencies in images and image‐capture methods in clinical settings have been shown to affect algorithm performance. 7 Guidelines for the standardized acquisition of clinical/dermoscopic images are needed to develop these comprehensive, inclusive national skin image databases, which can then be used to test, train and validate AI algorithms 2…”
Section: Issues Implications Recommendations (mentioning)
confidence: 99%