Background: Mammographic density improves the accuracy of breast cancer risk models. However, the use of breast density is limited by subjective assessment, variation across radiologists, and restricted data. A mammography-based deep learning (DL) model may provide more accurate risk prediction.
Purpose: To develop a mammography-based DL breast cancer risk model that is more accurate than established clinical breast cancer risk models.
Materials and Methods: This retrospective study included 88 994 consecutive screening mammograms in 39 571 women between January 1, 2009, and December 31, 2012. For each patient, all examinations were assigned to either training, validation, or test sets, resulting in 71 689, 8554, and 8751 examinations, respectively. Cancer outcomes were obtained through linkage to a regional tumor registry. By using risk factor information from patient questionnaires and electronic medical records review, three models were developed to assess breast cancer risk within 5 years: a risk-factor-based logistic regression model (RF-LR) that used traditional risk factors, a DL model (image-only DL) that used mammograms alone, and a hybrid DL model that used both traditional risk factors and mammograms. Comparisons were made to an established breast cancer risk model that included breast density (Tyrer-Cuzick model, version 8 [TC]). Model performance was compared by using areas under the receiver operating characteristic curve (AUCs) with DeLong test (P < .05).
Results: The test set included 3937 women, aged 56.20 years ± 10.04. Hybrid DL and image-only DL showed AUCs of 0.70 (95% confidence interval [CI]: 0.66, 0.75) and 0.68 (95% CI: 0.64, 0.73), respectively. RF-LR and TC showed AUCs of 0.67 (95% CI: 0.62, 0.72) and 0.62 (95% CI: 0.57, 0.66), respectively. Hybrid DL showed a significantly higher AUC (0.70) than TC (0.62; P < .001) and RF-LR (0.67; P = .01).
Conclusion: Deep learning models that use full-field mammograms yield substantially improved risk discrimination compared with the Tyrer-Cuzick (version 8) model.
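The study design above rests on two mechanics: keeping all examinations from one patient in the same data split, and scoring models by AUC. The following is a minimal sketch of both, not the authors' code. The function names, the hash-based assignment, and the 80/10/10 fractions are assumptions made for illustration; the abstract reports only the resulting examination counts. The DeLong significance test is not implemented here.

```python
import hashlib
from typing import Sequence


def assign_split(patient_id: str, train: float = 0.8, val: float = 0.1) -> str:
    """Map a patient ID to a split deterministically.

    Hashing the patient ID (not the examination ID) means every
    examination of one patient lands in the same split.
    The 0.8 / 0.1 / 0.1 fractions are illustrative assumptions.
    """
    digest = hashlib.sha256(patient_id.encode("utf-8")).hexdigest()
    u = int(digest[:8], 16) / 0x100000000  # uniform-ish value in [0, 1)
    if u < train:
        return "train"
    if u < train + val:
        return "validation"
    return "test"


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This O(n_pos * n_neg) loop is for clarity, not speed.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In this sketch, `labels` would be 1 for examinations followed by a cancer diagnosis within 5 years and 0 otherwise, and `scores` would be the model's predicted risk.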
Improved breast cancer risk models enable targeted screening strategies that achieve earlier detection and less screening harm than existing guidelines. To bring deep learning risk models to clinical practice, we need to further refine their accuracy, validate them across diverse populations, and demonstrate their potential to improve clinical workflows. We developed Mirai, a mammography-based deep learning model designed to predict risk at multiple timepoints, leverage potentially missing risk factor information, and produce predictions that are consistent across mammography machines. Mirai was trained on a large dataset from Massachusetts General Hospital (MGH) in the United States and tested on held-out test sets from MGH, Karolinska University Hospital in Sweden, and Chang Gung Memorial Hospital (CGMH) in Taiwan, obtaining C-indices of 0.76 (95% confidence interval, 0.74 to 0.80), 0.81 (0.79 to 0.82), and 0.79 (0.79 to 0.83), respectively. Mirai obtained significantly higher 5-year ROC AUCs than the Tyrer-Cuzick model (P < 0.001) and prior deep learning models Hybrid DL (P < 0.001) and Image-Only DL (P < 0.001), trained on the same dataset. Mirai more accurately identified high-risk patients than prior methods across all datasets. On the MGH test set, 41.5% (34.4 to 48.5) of patients who would develop cancer within 5 years were identified as high risk, compared with 36.1% (29.1 to 42.9) by Hybrid DL (P = 0.02) and 22.9% (15.9 to 29.6) by the Tyrer-Cuzick model (P < 0.001).
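The C-indices reported above measure how well predicted risk orders patients by time to diagnosis. As a rough illustration only, here is a simplified Harrell-style concordance index for right-censored data. The abstract does not say which C-index variant was used, so this should be read as a sketch of the general idea; the function name and argument layout are assumptions.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Simplified Harrell-style C-index.

    times  : follow-up time per patient (to diagnosis or to censoring)
    events : 1 if cancer was diagnosed at that time, 0 if censored
    risks  : predicted risk score (higher means higher predicted risk)

    A pair (i, j) is comparable when patient i had an event strictly
    before patient j's follow-up time. It is concordant when the
    earlier-event patient received the higher risk score; tied scores
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives some context for the reported values of 0.76 to 0.81.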
Mammographic breast density can mask cancers at mammography and is an independent risk factor for breast cancer (1-3). Legislation mandating patients be notified of mammographic breast density has passed in more than 30 states, and a federal bill is under consideration. Details of state legislation vary, but most states require direct reporting to the patient that breast density can mask cancers at mammography and that the patient may benefit from additional testing. Qualitative assessment of mammographic breast density is subjective and varies widely between radiologists (4-10). In a study of 83 radiologists who assessed breast density, Sprague et al (4) found extreme variation in qualitative density assessment per the Breast Imaging Reporting and Data System (BI-RADS), with 6%-85% of mammograms assessed as either heterogeneously or extremely dense depending on radiologist interpretation. In a study of 34 radiologists, the intraradiologist agreement of density assessments among women who underwent two examinations varied from 62% to 87% (6). Commercially available methods for automated assessment of breast density do exist; however, they yield mixed results in agreement with expert qualitative density assessments, with κ scores of 0.32-0.61 (11,12). These methods tend to result in over- or underreporting of breast density when compared with qualitative assessment by radiologists (11,13). A recent study found significant differences in density assessments in the same 4170 women with two software programs (Volpara, Volpara Solutions, Wellington, New Zealand; Quantra, Hologic, Bedford, Mass), with the programs classifying 37% and 51% of women, respectively, as having dense breast tissue. In the same set of mammograms, radiologists determined that 43% of the women had dense breast tissue (13). Deep learning (DL) has been gaining traction in radiology (12,14-17). 
Specifically, there has been preliminary work with DL methods to assess breast density (12,18); however, none of these techniques have been implemented in clinical practice, raising questions about clinical acceptance by practicing radiologists and the effect on patient care. In contrast, our purpose was to develop a DL algorithm we could use to reliably assess breast density and to measure the acceptance of its predictions in real-time clinical practice. We hypothesize that DL models can be applied to assess breast density at the same level as experienced breast imagers and that they can be accepted into routine clinical practice.
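The κ scores cited above quantify agreement between automated and radiologist density assessments after correcting for chance. The following is a minimal sketch of unweighted Cohen's κ for two raters. The function name and the category labels in the test data are hypothetical, and the cited studies may have used a weighted variant.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labeling the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the agreement expected if both raters assigned
    categories independently at their own marginal rates.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1.0 - expected)
```

For example, two raters who agree on three of four mammograms can still have κ well below 0.75, because part of that agreement is expected by chance.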