Background
Double reading (DR) in screening mammography increases cancer detection and lowers recall rates, but workforce shortages make it difficult to sustain. Artificial intelligence (AI) as an independent reader (IR) in DR may provide a cost-effective solution with the potential to improve screening performance. However, evidence that AI generalises across different patient populations, screening programmes and equipment vendors is still lacking.
Methods
This retrospective study simulated DR with AI as an IR, using data representative of real-world deployments (275,900 cases, 177,882 participants) from four mammography equipment vendors, seven screening sites, and two countries. Non-inferiority and superiority were assessed for relevant screening metrics.
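As an illustration of the non-inferiority and superiority assessment mentioned above, the following is a minimal sketch and not the study's statistical method. It assumes an unpaired Wald confidence interval for the difference in a proportion (such as specificity, where higher is better), a hypothetical absolute margin of 0.01, and made-up counts. The study compared readers on the same cases, so a paired analysis would be more appropriate in practice. The function names are illustrative only.

```python
import math


def difference_ci(x_new, n_new, x_ref, n_ref, z=1.96):
    """Wald CI for p_new - p_ref, treating the two samples as independent.

    This is a simplification: a paired design would need a paired method.
    """
    p_new = x_new / n_new
    p_ref = x_ref / n_ref
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    diff = p_new - p_ref
    return diff - z * se, diff + z * se


def assess(lower_bound, margin):
    """Classify a higher-is-better metric from the CI lower bound."""
    if lower_bound > 0:
        return "superior"
    if lower_bound > -margin:
        return "non-inferior"
    return "not shown non-inferior"


# Hypothetical margin and counts, chosen only to exercise both outcomes.
HYPOTHETICAL_MARGIN = 0.01
lower_a, _ = difference_ci(9720, 10000, 9650, 10000)
lower_b, _ = difference_ci(9660, 10000, 9650, 10000)
result_a = assess(lower_a, HYPOTHETICAL_MARGIN)
result_b = assess(lower_b, HYPOTHETICAL_MARGIN)
print(result_a)
print(result_b)
```

For a metric where lower is better, such as recall rate, the sign of the difference would be flipped before applying the same rule.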
Results
DR with AI, compared with human DR, showed at least non-inferior recall rate, cancer detection rate, sensitivity, specificity and positive predictive value (PPV) for each mammography vendor and site, and superior recall rate, specificity, and PPV for some. The simulation indicates that using AI would have increased the arbitration rate (from 3.3% to 12.3%), but could have reduced human workload by 30.0% to 44.8%.
Conclusions
AI has potential as an IR in the DR workflow across different screening programmes, mammography equipment and geographies, substantially reducing human reader workload while maintaining or improving standard of care.
Trial registration
ISRCTN18056078 (20/03/2019; retrospectively registered).
Screening mammography with two human readers increases cancer detection and lowers recall rates, but high resource requirements and a shortage of qualified readers make double reading unsustainable in many countries. The use of AI as an independent reader may yield more objective, accurate and outcome-based screening. Clinical validation of AI requires large-scale, multi-site, multi-vendor studies on unenriched cohorts.
This retrospective study evaluated the performance of the Mia™ version 2.0.1 AI system from Kheiron Medical Technologies on an unenriched sample (275,900 cases from 177,882 participants) collected across seven screening sites in two countries and four hardware vendors, representative of a real-world screening population over 10 years. Performance was determined for standalone AI and simulated double reading to assess non-inferiority and superiority on relevant screening metrics.
Standalone AI showed superiority on sensitivity and non-inferiority on specificity while detecting 29.7% of cancers found within three years after screening, and 29.8% of missed interval cancers. Double reading with AI was at least non-inferior to human double reading on every metric, with superiority for recall rate, specificity and positive predictive value (PPV). AI as an independent reader reduced the workload, but increased the arbitration rate from 3.3% to 12.3%. Applying the AI system under investigation would have reduced the overall number of human reads required by 44.8%. The recall rate was reduced by a relative 4.1%, suggesting there could be fewer follow-up procedures, reduced stress for patients, and less administrative and clinical work.
Using the AI system as an independent reader maintains the standard of care of double reading and detects cancers missed by human readers while automating a substantial part of the workflow, and could therefore bring significant clinical and operational benefits.
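The abstract does not state how the 44.8% workload figure was derived. As a rough sanity check, the sketch below uses a simple accounting model that is an assumption, not the study's documented method: every case receives a fixed number of first-line human reads (two in human double reading, one when AI replaces the second reader), plus one additional human read for each case sent to arbitration. Under that assumption, the reported arbitration rates of 3.3% and 12.3% give a reduction consistent with the reported figure.

```python
def human_reads_per_case(first_line_reads: int, arbitration_rate: float) -> float:
    """Average human reads per case.

    Assumes one extra human read for each arbitrated case (illustrative model).
    """
    return first_line_reads + arbitration_rate


def workload_reduction(baseline_reads: float, new_reads: float) -> float:
    """Relative reduction in human reads versus the baseline workflow."""
    return 1.0 - new_reads / baseline_reads


# Human double reading: two readers, 3.3% of cases arbitrated.
human_dr = human_reads_per_case(2, 0.033)
# Double reading with AI: one human reader, 12.3% of cases arbitrated.
ai_dr = human_reads_per_case(1, 0.123)

reduction = workload_reduction(human_dr, ai_dr)
print(f"Human reads per case: {human_dr:.3f} -> {ai_dr:.3f}")
print(f"Estimated workload reduction: {reduction:.1%}")
```

The model also shows why a higher arbitration rate offsets only a small part of the saving: removing one of the two first-line reads dominates the total.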