Background Artificial intelligence (AI) systems performing at radiologist-like levels in the evaluation of digital mammography (DM) could improve breast cancer screening accuracy and efficiency. We aimed to compare the stand-alone performance of an AI system with that of radiologists in detecting breast cancer in DM. Methods Nine multi-reader, multi-case study datasets previously used for different research purposes in seven countries were collected. Each dataset consisted of DM exams acquired with systems from four different vendors, multiple radiologists’ assessments per exam, and ground truth verified by histopathological analysis or follow-up, yielding a total of 2652 exams (653 malignant) and interpretations by 101 radiologists (28 296 independent interpretations). An AI system analyzed these exams, yielding a score between 1 and 10 representing the level of suspicion that cancer is present. The detection performance of the AI system was compared with that of the radiologists using a noninferiority null hypothesis at a margin of 0.05. Results The performance of the AI system was statistically noninferior to that of the average of the 101 radiologists. The AI system had a 0.840 (95% confidence interval [CI] = 0.820 to 0.860) area under the ROC curve and the average of the radiologists was 0.814 (95% CI = 0.787 to 0.841) (difference 95% CI = −0.003 to 0.055). The AI system had an AUC higher than that of 61.4% of the radiologists. Conclusions The evaluated AI system achieved a cancer detection accuracy comparable to that of an average breast radiologist in this retrospective setting. Although promising, the performance and impact of such a system in a screening setting need further investigation.
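The noninferiority comparison described above can be sketched as follows. This is a minimal illustration, assuming synthetic scores and labels rather than the study data; the AUC is computed via the Mann–Whitney pairwise formulation, and the 0.05 margin and the readers' 0.814 AUC are taken from the abstract.

```python
# Hedged sketch: stand-alone AUC and a non-inferiority check at a
# margin of 0.05. All scores/labels below are synthetic placeholders.
import numpy as np

def auc(scores, labels):
    """AUC via the Mann-Whitney U statistic (ties get half credit)."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Fraction of (positive, negative) pairs ranked correctly.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def noninferior(auc_ai, auc_readers, margin=0.05):
    """AI is non-inferior if its AUC is within `margin` below the readers'."""
    return auc_ai - auc_readers > -margin

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 500)
# Synthetic AI scores on a 1-10 scale, loosely correlated with truth.
ai_scores = np.clip(labels * 4 + rng.normal(4, 2, 500), 1, 10)

a = auc(ai_scores, labels)
print(round(a, 3), noninferior(a, 0.814))  # 0.814 is the readers' AUC above
```

In the study itself the comparison is made against a confidence interval of the AUC difference rather than point estimates, so this sketch only illustrates the decision rule.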
A three-dimensional (3D) linear model for digital breast tomosynthesis (DBT) was developed to investigate the effects of different imaging system parameters on the reconstructed image quality. In the present work, experimental validation of the model was performed on a prototype DBT system equipped with an amorphous selenium (a-Se) digital mammography detector and filtered backprojection (FBP) reconstruction methods. The detector can be operated either at full resolution with 85 μm pixel size or in 2 × 1 pixel-binning mode to reduce acquisition time. Twenty-five projection images were acquired with a nominal angular range of ±20°. The images were reconstructed using a slice thickness of 1 mm with 0.085 × 0.085 mm in-plane pixel dimension. The imaging performance was characterized by spatial frequency-dependent parameters including a 3D noise power spectrum (NPS) and in-plane modulation transfer function (MTF). Scatter-free uniform x-ray images were acquired at four different exposure levels for noise analysis. An aluminum (Al) edge phantom with 0.2 mm thickness was imaged to measure the in-plane presampling MTF. The measured in-plane MTF and 3D NPS were both in good agreement with the model. The dependence of DBT image quality on reconstruction filters was investigated. It was found that the slice thickness (ST) filter, a Hanning window to limit the high-frequency components in the slice thickness direction, reduces noise aliasing and improves 3D DQE. An ACR phantom was imaged to investigate the effects of angular range and detector operational modes on reconstructed image quality. It was found that increasing the angular range improves the MTF at low frequencies, resulting in better detection of large-area, low-contrast mass lesions in the phantom. There is a trade-off between noise and resolution for pixel-binning and full-resolution modes, and the choice of detector mode will depend on radiation dose and the targeted lesion.
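The slice-thickness filter described above can be illustrated with a minimal sketch: a Hanning-shaped window applied along the slice direction in frequency space, which preserves the DC component and suppresses components near Nyquist. The array shape and axis convention below are illustrative assumptions, not the prototype system's parameters.

```python
# Hedged sketch of the slice-thickness (ST) filter idea: a Hanning
# window along the slice direction that attenuates high spatial
# frequencies, reducing noise aliasing. Volume dimensions are toy values.
import numpy as np

def hanning_st_filter(volume, axis=0):
    """Attenuate high frequencies along `axis` with a Hanning window.

    The window equals 1 at zero frequency (DC preserved) and 0 at the
    Nyquist frequency (maximal suppression).
    """
    n = volume.shape[axis]
    freqs = np.fft.fftfreq(n)                  # cycles/sample in [-0.5, 0.5)
    window = 0.5 * (1 + np.cos(2 * np.pi * freqs))
    shape = [1] * volume.ndim
    shape[axis] = n
    spectrum = np.fft.fft(volume, axis=axis) * window.reshape(shape)
    return np.fft.ifft(spectrum, axis=axis).real

rng = np.random.default_rng(0)
noisy = rng.normal(0, 1, (32, 64, 64))         # toy reconstructed volume
filtered = hanning_st_filter(noisy, axis=0)
print(noisy.std(), filtered.std())             # filtering lowers noise power
```

The trade-off noted in the abstract appears here directly: the same attenuation that suppresses noise also blurs detail along the slice-thickness direction.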
While deep convolutional neural networks (CNNs) have been successfully applied to 2D image analysis, it is still challenging to apply them to 3D anisotropic volumes, especially when the within-slice resolution is much higher than the between-slice resolution and when the amount of 3D volumes is relatively small. On one hand, direct learning of a CNN with 3D convolution kernels suffers from the lack of data and likely ends up with poor generalization; insufficient GPU memory limits the model size or representational power. On the other hand, applying a 2D CNN with generalizable features to 2D slices ignores between-slice information. Coupling a 2D network with an LSTM to further handle the between-slice information is not optimal due to the difficulty in LSTM learning. To overcome the above challenges, we propose a 3D Anisotropic Hybrid Network (AH-Net) that transfers convolutional features learned from 2D images to 3D anisotropic volumes. Such a transfer inherits the desired strong generalization capability for within-slice information while naturally exploiting between-slice information for more effective modelling. The focal loss is further utilized for more effective end-to-end learning. We experiment with the proposed 3D AH-Net on two different medical image analysis tasks, namely lesion detection from a Digital Breast Tomosynthesis volume, and liver and liver tumor segmentation from a Computed Tomography volume, and obtain state-of-the-art results.
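The focal loss mentioned above can be sketched in a few lines. This follows the standard binary formulation of Lin et al.; the alpha and gamma values below are the commonly used defaults and are an assumption here, not necessarily the values used for AH-Net.

```python
# Hedged sketch of the binary focal loss: cross-entropy scaled by
# (1 - p_t)^gamma, which down-weights easy, well-classified examples
# so training focuses on hard ones. alpha/gamma defaults are assumed.
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """p: predicted probability of the positive class; y: 0/1 label."""
    p = np.clip(np.asarray(p, float), eps, 1 - eps)
    y = np.asarray(y, float)
    p_t = np.where(y == 1, p, 1 - p)           # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# An easy, well-classified example contributes far less than a hard one.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.05, 1)
print(easy, hard)
```

With gamma set to 0 and alpha to 1, the expression reduces to ordinary cross-entropy, which makes the modulating effect of the `(1 - p_t)^gamma` factor easy to verify.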
Purpose To study the feasibility of automatically identifying normal digital mammography (DM) exams with artificial intelligence (AI) to reduce the breast cancer screening reading workload. Methods and materials A total of 2652 DM exams (653 cancer) and interpretations by 101 radiologists were gathered from nine previously performed multi-reader multi-case receiver operating characteristic (MRMC ROC) studies. An AI system was used to obtain a score between 1 and 10 for each exam, representing the likelihood of cancer present. Using all AI scores between 1 and 9 as possible thresholds, the exams were divided into groups of low and high likelihood of cancer present. It was assumed that, under the pre-selection scenario, only the high-likelihood group would be read by radiologists, while all low-likelihood exams would be reported as normal. The area under the reader-averaged ROC curve (AUC) was calculated for the original evaluations and for the pre-selection scenarios and compared using a non-inferiority hypothesis. Results Setting the low/high-likelihood threshold at an AI score of 5 (high likelihood > 5) results in a trade-off of approximately halving (−47%) the workload to be read by radiologists while excluding 7% of true-positive exams. Using an AI score of 2 as threshold yields a workload reduction of 17% while excluding only 1% of true-positive exams. Pre-selection did not change the average AUC of radiologists (inferior 95% CI > −0.05) for any threshold except at the extreme AI score of 9. Conclusion It is possible to automatically pre-select exams using AI to significantly reduce the breast cancer screening reading workload. Key Points • There is potential to use artificial intelligence to automatically reduce the breast cancer screening reading workload by excluding exams with a low likelihood of cancer. • The exclusion of exams with the lowest likelihood of cancer in screening might not change radiologists’ breast cancer detection performance.
• When excluding exams with the lowest likelihood of cancer, the decrease in true-positive recalls would be balanced by a simultaneous reduction in false-positive recalls.
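The threshold trade-off described above can be sketched as follows: for a given AI-score threshold, count the fraction of exams removed from the reading workload and the fraction of cancers that would be auto-reported as normal. The scores and prevalence below are synthetic stand-ins shaped loosely like the study population (2652 exams, 653 cancers), not the study's data, so the printed numbers will not match the abstract.

```python
# Hedged sketch of AI pre-selection: exams scoring at or below the
# threshold are auto-reported as normal; the rest go to radiologists.
# All scores below are synthetic placeholders.
import numpy as np

def preselection_tradeoff(scores, labels, threshold):
    """Return (workload_reduction, missed_cancer_fraction) at a threshold."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    excluded = scores <= threshold             # auto-reported as normal
    workload_reduction = excluded.mean()
    n_cancers = max((labels == 1).sum(), 1)
    missed = (excluded & (labels == 1)).sum() / n_cancers
    return workload_reduction, missed

rng = np.random.default_rng(0)
labels = (rng.random(2652) < 653 / 2652).astype(int)   # toy prevalence
# Synthetic 1-10 scores: cancers skew high, normals skew low.
scores = np.where(labels == 1,
                  rng.integers(4, 11, labels.size),
                  rng.integers(1, 9, labels.size))

for t in (2, 5):
    wr, miss = preselection_tradeoff(scores, labels, t)
    print(f"threshold {t}: workload -{wr:.0%}, missed cancers {miss:.1%}")
```

Sweeping the threshold from 1 to 9 in this way traces out the workload/sensitivity trade-off curve that the study evaluates against its non-inferiority criterion.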