Supplemental material is available for this article. Keywords: Mammography, Screening, Convolutional Neural Network (CNN) Published under a CC BY 4.0 license. See also the commentary by Cadrin-Chênevert in this issue.
Background: Artificial intelligence (AI) readers, derived from applying deep learning models to medical image analysis, hold great promise for improving population breast cancer screening. However, previous evaluations of AI readers for breast cancer screening have mostly been conducted using cancer-enriched cohorts and have lacked assessment of the potential use of AI readers alongside radiologists in multi-reader screening programs. Here, we present a new AI reader for detecting breast cancer from mammograms in a large-scale population screening setting, and a novel analysis of the potential for human-AI reader collaboration in a well-established, high-performing population screening program. We evaluated the performance of our AI reader and AI-integrated screening scenarios using a two-year, real-world, population dataset from Victoria, Australia, a screening program in which two radiologists independently assess each episode and disagreements are arbitrated by a third radiologist.
Methods: We used a retrospective full-field digital mammography image and non-image dataset comprising 808,318 episodes, 577,576 clients and 3,404,326 images in the period 2013 to 2019. Screening episodes from 2016, 2017 and 2018 were sequential population cohorts containing 752,609 episodes, 565,087 clients and 3,169,322 images. The dataset was split by screening date into training, development, and testing sets. All episodes from 2017 and 2018 were allocated to the testing set (509,109 episodes; 3,651 screen-detected cancer episodes). Eight distinct AI models were trained on subsets of the training set (which included a validation set) and combined into our ensemble AI reader. Operating points were selected using the development set. We evaluated our AI reader on our testing set and on external datasets previously unseen by our models.
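The abstract does not say how the eight models are combined or how the operating points are chosen. As a minimal sketch only, assuming the ensemble averages per-model scores and that an operating point is the lowest score threshold meeting a target specificity on the development set, the two steps could look like this. The function names `ensemble_score` and `pick_threshold` are hypothetical and do not come from the paper.

```python
from statistics import mean


def ensemble_score(model_scores):
    """Combine per-model scores for one episode by simple averaging.

    Averaging is an assumption; the abstract does not state the rule.
    """
    return mean(model_scores)


def pick_threshold(dev_scores, dev_labels, target_specificity):
    """Return the lowest threshold whose development-set specificity
    reaches the target. An episode is recalled when score >= threshold.

    dev_labels: 1 for a cancer episode, 0 otherwise.
    """
    negatives = [s for s, y in zip(dev_scores, dev_labels) if y == 0]
    if not negatives:
        raise ValueError("development set has no non-cancer episodes")
    # Try every observed score, then +inf (recall nobody) as a fallback.
    candidates = sorted(set(dev_scores)) + [float("inf")]
    for threshold in candidates:
        true_negatives = sum(1 for s in negatives if s < threshold)
        if true_negatives / len(negatives) >= target_specificity:
            return threshold
```

Choosing the threshold on the development set and then freezing it keeps the testing set untouched by any tuning, which is the role the abstract assigns to the development set.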
Findings: The AI reader outperformed the mean individual radiologist on this large retrospective testing dataset with an area under the receiver operating characteristic curve of 0.92 (95% CI 0.91, 0.92). The AI reader generalised well across screening round, client demographics, device manufacturer and cancer type, and achieved state-of-the-art performance on external datasets compared to recently published AI readers. Our simulations of AI-integrated screening scenarios demonstrated that a reader-replacement human-AI collaborative system could achieve better sensitivity and specificity (82.6%, 96.1%) than the current two-reader consensus system (79.9%, 96.0%), with reduced human reading workload and cost. Our band-pass AI-integrated scenario also achieved higher sensitivity and specificity (80.6%, 96.2%) than the current system, with larger reductions in human reading workload and cost.
Interpretation: This study demonstrated that human-AI collaboration in a population breast cancer screening program has the potential to improve accuracy and lower radiologist workload and costs in real-world settings. The next stage of validation is to undertake prospective studies that can also assess the effects of the AI systems on human performance and behaviour.
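The decision logic behind the two-reader consensus system and the reader-replacement scenario can be illustrated with a small sketch. It assumes each read is a boolean recall decision, that the AI simply takes the place of the second reader, and that the arbiter's decision is known for every episode; the function names and the toy episodes are hypothetical and are not data from the study. The band-pass scenario is not defined in the abstract and is not covered here.

```python
def two_reader_consensus(reader1, reader2, arbiter):
    """Agreeing readers decide; on disagreement the third reader decides."""
    return reader1 if reader1 == reader2 else arbiter


def reader_replacement(reader1, ai_recall, arbiter):
    """Same rule, with the AI decision standing in for the second reader."""
    return two_reader_consensus(reader1, ai_recall, arbiter)


def human_reads(first, second, second_is_human):
    """Human reads for one episode: the independent reads plus arbitration."""
    independent = 2 if second_is_human else 1
    return independent + (1 if first != second else 0)


def sensitivity_specificity(decisions, labels):
    """Return (sensitivity, specificity) for recall decisions vs. cancer labels."""
    tp = sum(1 for d, y in zip(decisions, labels) if d and y)
    fn = sum(1 for d, y in zip(decisions, labels) if not d and y)
    tn = sum(1 for d, y in zip(decisions, labels) if not d and not y)
    fp = sum(1 for d, y in zip(decisions, labels) if d and not y)
    return tp / (tp + fn), tn / (tn + fp)


# Toy episodes (made up): (reader1, reader2, ai, arbiter, cancer)
episodes = [
    (True, True, True, True, 1),
    (True, False, True, True, 1),
    (False, False, True, True, 1),
    (False, False, False, False, 0),
    (True, False, False, False, 0),
]
labels = [e[4] for e in episodes]
baseline = [two_reader_consensus(r1, r2, arb) for r1, r2, _, arb, _ in episodes]
replaced = [reader_replacement(r1, ai, arb) for r1, _, ai, arb, _ in episodes]
baseline_reads = sum(human_reads(r1, r2, True) for r1, r2, _, _, _ in episodes)
replaced_reads = sum(human_reads(r1, ai, False) for r1, _, ai, _, _ in episodes)
```

Comparing `sensitivity_specificity(baseline, labels)` with `sensitivity_specificity(replaced, labels)`, and `baseline_reads` with `replaced_reads`, mirrors the kind of accuracy and workload comparison the abstract reports, although the real simulation would also have to model how an arbiter behaves on episodes that were never arbitrated in practice.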
In this paper, we study the problem of extracting trends from time series data with missing values. In particular, we investigate a general class of procedures that impute the missing data and then extract trends using seasonal-trend decomposition based on loess (STL), where loess stands for locally weighted smoothing, a popular tool for describing the regression relationship between two variables by a smooth curve. We refer to these as imputation-STL procedures. Two results are obtained in this paper. First, we settle a theoretical issue, namely the connection between the imputation error and the overall error in estimating the trend. Specifically, we derive bounds for the overall error in terms of the imputation error. This facilitates the error analysis of any imputation-STL procedure and justifies its use in practice. Second, we investigate loess-STL, a particular imputation-STL procedure in which the imputation is also performed using loess. Through both theoretical arguments and simulation results, we show that loess-STL can handle a high proportion of missing data and provide reliable trend estimates if the underlying trend is smooth and the missing data are dispersed over the time series. In addition to the mathematical derivations and the simulation study, we apply our loess-STL procedure to profile radiosonde records of upper air temperature at 22 Antarctic research stations covering the past 50 years. For the purpose of illustration, we present in this paper only the results for Novolazaravskaja station, which has temperature records with more than 8.4% dispersed missing values at 8 pressure levels from October 1969 to March 2011.
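The imputation half of an imputation-STL procedure can be sketched in a few lines. The example below is a simplified loess-style imputer, not the paper's implementation: it assumes equally spaced time points, marks missing values with `None`, and fits a tricube-weighted local line through the `span` nearest observed points, with the bandwidth widened by one step so the farthest neighbour keeps a positive weight. The function name `loess_impute` and the `span` parameter are hypothetical, and loess's robustness iterations are omitted.

```python
def loess_impute(series, span=5):
    """Fill None entries using a tricube-weighted local linear fit."""
    observed = [(t, y) for t, y in enumerate(series) if y is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed values")
    filled = list(series)
    for t0, y0 in enumerate(series):
        if y0 is not None:
            continue
        # The `span` observed points closest in time to the gap.
        nearest = sorted(observed, key=lambda p: abs(p[0] - t0))[:span]
        # Bandwidth just beyond the farthest neighbour (a simplification).
        h = max(abs(t - t0) for t, _ in nearest) + 1.0
        weights = [(1 - (abs(t - t0) / h) ** 3) ** 3 for t, _ in nearest]
        total = sum(weights)
        t_bar = sum(w * t for w, (t, _) in zip(weights, nearest)) / total
        y_bar = sum(w * y for w, (_, y) in zip(weights, nearest)) / total
        sxx = sum(w * (t - t_bar) ** 2 for w, (t, _) in zip(weights, nearest))
        sxy = sum(
            w * (t - t_bar) * (y - y_bar) for w, (t, y) in zip(weights, nearest)
        )
        slope = sxy / sxx if sxx > 0 else 0.0
        # Evaluate the weighted least-squares line at the missing time point.
        filled[t0] = y_bar + slope * (t0 - t_bar)
    return filled


# Toy series (made up) with two dispersed gaps.
completed = loess_impute([1.0, 3.0, None, 7.0, 9.0, None, 13.0], span=4)
```

The completed series would then be passed to an STL implementation (for example the `STL` class in statsmodels) to extract the trend. Because each fit is local, isolated gaps in a smooth series are filled from nearby observations on both sides, which is consistent with the abstract's condition that the trend be smooth and the missing values dispersed.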