Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We tackle this problem by releasing the HAM10000 (“Human Against Machine with 10000 training images”) dataset. We collected dermatoscopic images from different populations acquired and stored by different modalities. Given this diversity we had to apply different acquisition and cleaning methods and developed semi-automatic workflows utilizing specifically trained neural networks. The final dataset consists of 10015 dermatoscopic images which are released as a training set for academic machine learning purposes and are publicly available through the ISIC archive. This benchmark dataset can be used for machine learning and for comparisons with human experts. Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions. More than 50% of lesions have been confirmed by pathology, while the ground truth for the rest of the cases was either follow-up, expert consensus, or confirmation by in-vivo confocal microscopy.
Mailings and social media posts of the International Dermoscopy Society were used to recruit targeted groups. Recruitment focused on medical personnel interested in the diagnosis of skin cancer. Recruitment of raters may have been influenced by self-selection bias and may therefore favor motivated and skilled raters. Skill level was included as a covariate in the interaction experiments. Each rater had to perform multiple screening tests to ensure that the self-reported experience matched actual skills. Because of self-selection bias, the generalisability of our results to a less motivated group of readers may be limited.
Ethics oversight: Ethics review board of the Medical University of Vienna.
Background
Evolving dermoscopic terminology motivated us to initiate a new consensus.
Objective
We sought to establish a dictionary of standardized terms.
Methods
We reviewed the medical literature, conducted a survey, and convened a discussion among experts.
Results
Two competing terminologies exist: a more metaphoric terminology that includes numerous terms, and a descriptive terminology based on 5 basic terms. In a survey among members of the International Dermoscopy Society (IDS), 23.5% (n = 201) of participants preferentially use descriptive terminology, 20.1% (n = 172) use metaphoric terminology, and 56.5% (n = 484) use both. More participants who had initially been trained in metaphoric terminology prefer using descriptive terminology than vice versa (9.7% vs 2.6%, P < .001). Most new terms published since the last consensus conference in 2003 were unknown to the majority of participants. There was uniform consensus that both terminologies are suitable, that metaphoric terms need definitions, that synonyms should be avoided, and that the creation of new metaphoric terms should be discouraged. The expert panel proposed a dictionary of standardized terms that takes account of both metaphoric and descriptive terms.
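The reported survey shares can be checked against the reported counts. The following is a minimal sketch in Python, assuming that the three preference groups are mutually exclusive and together make up all respondents (the abstract does not state the total explicitly); the names `counts` and `shares` are illustrative only.

```python
# Counts as reported in the survey results above.
# Assumption: the three groups are disjoint and cover all respondents.
counts = {"descriptive": 201, "metaphoric": 172, "both": 484}


def shares(group_counts):
    """Return each group's share of the total, in percent, to one decimal."""
    total = sum(group_counts.values())
    return {name: round(100 * n / total, 1) for name, n in group_counts.items()}


total_respondents = sum(counts.values())
result = shares(counts)

for name, pct in result.items():
    print(f"{name}: {counts[name]} of {total_respondents} ({pct}%)")
```

Under that assumption, the computed shares agree with the percentages given in the text.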
Limitations
A consensus seeks a workable compromise but does not guarantee its implementation.
Conclusion
The new consensus provides a revised framework of standardized terms to enhance the consistent use of dermoscopic terminology.