Context.—
Automated prostate cancer detection using machine learning technology has led to speculation that pathologists will soon be replaced by algorithms. This review covers the development of machine learning algorithms and their reported effectiveness in prostate cancer detection and Gleason grading.
Objective.—
To examine the accuracy and classification abilities of current algorithms. We provide a general explanation of the technology and how it is being used in clinical practice. The challenges to the application of machine learning algorithms in clinical practice are also discussed.
Data Sources.—
The literature for this review was identified and collected using a systematic search. Selection criteria were established before sorting to direct the selection of studies. A 4-point system was used to rank the papers by relevance. For papers accepted as relevant under these criteria, all cited and citing studies were also reviewed. Studies were then categorized by whether they implemented binary or multiclass classification methods. Data were extracted from papers that reported accuracy, area under the curve (AUC), or κ values in the context of prostate cancer detection. The results were summarized visually to compare accuracy trends between binary and multiclass classification.
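As an illustration of two of the metrics named above, the following minimal Python sketch computes accuracy and an unweighted Cohen's κ from paired reference and predicted labels. The function names are hypothetical and are not taken from any reviewed study; studies of Gleason grading may instead report a weighted κ, which this sketch does not implement, and AUC is omitted because it requires prediction scores rather than labels.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label matches the reference label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement.

    Assumption: every disagreement counts equally, regardless of how far
    apart the two labels are (no linear or quadratic weighting).
    """
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over labels of the product of marginal frequencies.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both sequences use one identical label")
    return (p_observed - p_expected) / (1.0 - p_expected)
```

The same two functions apply to a binary task (for example, labels for benign versus cancer) and to a multiclass task (for example, one label per grade category), so results from both study categories can be computed on the same footing.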
Conclusions.—
It is more difficult to achieve high accuracy metrics for multiclass classification tasks than for binary tasks. The clinical implementation of an algorithm that can assign a Gleason grade to clinical whole slide images (WSIs) remains elusive. Machine learning technology cannot currently replace pathologists but can serve as an important safeguard against misdiagnosis.