(1) Background: Recent studies report high accuracies when using machine learning (ML) algorithms to classify prostate cancer lesions on publicly available datasets. However, it is unknown if these trained models generalize well to data from different institutions. (2) Methods: This was a retrospective study using multi-parametric Magnetic Resonance Imaging (mpMRI) data from our institution (63 mpMRI lesions) and the ProstateX-2 challenge, a publicly available annotated image set (112 mpMRI lesions). Residual Neural Network (ResNet) algorithms were trained to classify lesions as high-risk (hrPCA) or low-risk/benign. Models were trained on (a) ProstateX-2 data, (b) local institutional data, and (c) combined ProstateX-2 and local data. The models were then tested on (a) ProstateX-2, (b) local and (c) combined ProstateX-2 and local data. (3) Results: Models trained on either local or ProstateX-2 image data had high Area Under the ROC Curve (AUC)s (0.82–0.98) in the classification of hrPCA when tested on their own respective populations. AUCs decreased significantly (0.23–0.50, p < 0.01) when models were tested on image data from the other institution. Models trained on image data from both institutions re-achieved high AUCs (0.83–0.99). (4) Conclusions: Accurate prostate cancer classification models trained on single-institutional image data performed poorly when tested on outside-institutional image data. Heterogeneous multi-institutional training image data will likely be required to achieve broadly applicable mpMRI models.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.