“…While the selection of lesions from the Dermofit dataset does not present an extreme imbalance between benign and malignant tumours (≈29:20), the imbalance increases greatly when individual samples are compared (≈97:13 in the worst case). For this reason, the present study used evaluation metrics less susceptible to changes in sample balance [55], namely Accuracy, Precision, Recall, the F1 Score, and the Area Under the precision–recall Curve (AUC). Each of these metrics, except for AUC, was calculated from confusion matrices, which tally the correctly classified individuals (True Positives and True Negatives) and the misclassified individuals (False Positives and False Negatives).…”
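
As an illustration of how the four confusion-matrix metrics named above follow from the True/False Positive/Negative counts, here is a minimal Python sketch. It is not the study's code: the function name `confusion_metrics` and the example counts are hypothetical, chosen only so that the class split matches the worst-case ≈97:13 ratio, and the convention of returning 0.0 for an undefined ratio is an assumption.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute Accuracy, Precision, Recall and F1 from confusion-matrix counts.

    Hypothetical helper for illustration; undefined ratios (zero denominator)
    are reported as 0.0, which is an assumed convention.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts: 13 positive samples (tp + fn) and 97 negative samples (fp + tn).
metrics = confusion_metrics(tp=10, fp=5, fn=3, tn=92)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, Accuracy (about 0.927) is noticeably higher than Precision (about 0.667), Recall (about 0.769) and F1 (about 0.714), because the 92 True Negatives dominate the total. The AUC of the precision–recall curve cannot be obtained from a single confusion matrix, since it needs the classifier's scores across thresholds, which is why the excerpt treats it separately.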