Research using whole slide images (WSIs) of scanned histopathology slides for the development of artificial intelligence (AI) algorithms has increased exponentially over recent years. Glass slides from large retrospective cohorts with patient follow-up data are digitised for the development and validation of AI tools. Such resources therefore become very important, and their quality must meet the standard necessary for downstream AI development. However, manual quality control of such large cohorts of WSIs by visual assessment is infeasible. Existing quality control AI algorithms either focus on individual aspects of image quality, e.g. focus, or use traditional machine-learning methods based on hand-crafted features, which are unable to classify the range of potential image artefacts that should be considered.
In this study, we have trained and validated a multi-task deep neural network to automate the process of quality control of a large retrospective cohort of prostate cases from which glass slides have been scanned several years after production, to determine both the usability of the images for research and the common image artefacts present.
Using a two-layer approach, quality overlays of WSIs were generated from a quality assessment undertaken at patch-level at 5X magnification. From these quality overlays the slide-level quality scores were predicted and then compared to those generated by three specialist urological pathologists, with a Pearson correlation of 0.89 for overall usability (at a diagnostic level), and 0.87 and 0.82 for focus and H&E staining quality scores respectively. We subsequently applied our quality assessment pipeline to the TCGA prostate cancer cohort and to a colorectal cancer cohort, for comparison.
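The comparison described above can be illustrated with a minimal sketch. It is not the PathProfiler implementation: the plain mean used for slide-level aggregation is an assumed stand-in for the learned second-layer predictor, and the names `slide_mean_score` and `pearson` are hypothetical. An overlay is represented here as a 2D grid of patch scores in [0, 1], with `None` marking background patches.

```python
from math import sqrt
from statistics import mean


def slide_mean_score(overlay):
    """Collapse a patch-level quality overlay into one slide-level score.

    ASSUMPTION: a plain mean over tissue patches; the study instead predicts
    slide-level scores from the overlays with a second model.
    """
    tissue = [s for row in overlay for s in row if s is not None]
    if not tissue:
        raise ValueError("overlay contains no tissue patches")
    return mean(tissue)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Illustrative values only, not data from the study.
overlays = [
    [[0.9, 0.8], [None, 1.0]],
    [[0.2, 0.4], [0.3, None]],
    [[0.6, None], [0.5, 0.7]],
]
predicted = [slide_mean_score(o) for o in overlays]
pathologist = [0.95, 0.25, 0.65]  # hypothetical reference scores
r = pearson(predicted, pathologist)
```

Agreement between predicted and reference slide-level scores would then be reported as `r`, computed separately for each quality criterion (usability, focus, staining).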
Our model, designated as PathProfiler, indicates comparable predicted usability of images from the cohorts assessed (86-90%), and perhaps more significantly is able to predict WSIs that could benefit from re-scanning or re-staining for quality improvement.
We have shown in this study that AI can be used to automate the process of quality control of large retrospective cohorts to maximise research outputs and conclusions.