Purpose
To evaluate the test performance of the QUARTZ (QUantitative Analysis of Retinal vessel Topology and siZe) software in detecting retinal features from retinal images captured by health care professionals in a Danish high street optician chain, compared with test performance from other large population studies (i.e., UK Biobank) where retinal images were captured by non-experts.
Method
The dataset FOREVERP (Finding Ophthalmic Risk and Evaluating the Value of Eye exams and their predictive Reliability, Pilot) contains retinal images obtained from a Danish high street optician chain. The QUARTZ algorithm utilizes both image processing and machine learning methods to determine retinal image quality, vessel segmentation, vessel width, vessel classification (arterioles or venules), and optic disc localization. Outcomes were evaluated by metrics including sensitivity, specificity, and accuracy and compared to human expert ground truths.
Results
QUARTZ’s performance was evaluated on a subset of 3,682 images from the FOREVERP database. 80.55% of the FOREVERP images were labelled as being of adequate quality compared to 71.53% of UK Biobank images, with a vessel segmentation sensitivity of 74.64% and specificity of 98.41% (FOREVERP) compared with a sensitivity of 69.12% and specificity of 98.88% (UK Biobank). The mean (± standard deviation) vessel width of the ground truth was 16.21 (4.73) pixels compared to that predicted by QUARTZ of 17.01 (4.49) pixels, resulting in a difference of -0.8 (1.96) pixels. The differences were stable across a range of vessels. The detection rate for optic disc localisation was similar for the two datasets.
Conclusion
QUARTZ showed high performance when evaluated on the FOREVERP dataset, and demonstrated robustness across datasets, providing validity to direct comparisons and pooling of retinal feature measures across data sources.