Objective: Deep learning has become a promising approach for automated support of clinical diagnosis. When medical data samples are limited, collaboration among multiple institutions is necessary to achieve high algorithm performance. However, sharing patient data is often restricted by technical, legal, or ethical concerns. In this study, we propose distributing deep learning models as an attractive alternative to sharing patient data. Methods: We simulate the distribution of deep learning models across 4 institutions using various training heuristics and compare the results with a deep learning model trained on centrally hosted patient data. The training heuristics investigated include ensembling single-institution models, single weight transfer, and cyclical weight transfer. We evaluated these approaches for image classification in 3 independent image collections (retinal fundus photos, mammography, and ImageNet). Results: We find that cyclical weight transfer resulted in performance comparable to that of centrally hosted patient data. We also found that the performance of the cyclical weight transfer heuristic improved with a higher frequency of weight transfer. Conclusions: We show that distributing deep learning models is an effective alternative to sharing patient data. This finding has implications for any collaborative deep learning study.
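To make the weight-transfer heuristics concrete, below is a minimal PyTorch sketch of cyclical weight transfer, assuming hypothetical `institution_loaders` that wrap each site's private data; it illustrates the idea rather than reproducing the authors' training code.

```python
# Minimal sketch of cyclical weight transfer (hypothetical names, not the
# authors' exact code). One model's weights are passed from institution to
# institution; each site trains locally and its raw data never leaves the site.
import torch
import torch.nn as nn
import torchvision.models as models

def train_locally(model, loader, epochs, device="cpu"):
    """Train the shared model on one institution's private data."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model

def cyclical_weight_transfer(institution_loaders, cycles=10, epochs_per_visit=1):
    """Cycle the model through all institutions. A smaller epochs_per_visit
    means more frequent weight transfer."""
    model = models.resnet18(num_classes=2)  # binary image classification
    for _ in range(cycles):
        for loader in institution_loaders:  # e.g., 4 loaders, one per site
            model = train_locally(model, loader, epochs_per_visit)
    return model
```

In this framing, single weight transfer corresponds to visiting each institution exactly once, while the ensembling heuristic instead trains one model per site and averages their predictions.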
IMPORTANCE Mammography screening currently relies on subjective human interpretation. Artificial intelligence (AI) advances could be used to increase mammography screening accuracy by reducing missed cancers and false positives. OBJECTIVE To evaluate whether AI can overcome human mammography interpretation limitations with a rigorous, unbiased evaluation of machine learning algorithms. DESIGN, SETTING, AND PARTICIPANTS In this diagnostic accuracy study conducted between September 2016 and November 2017, an international, crowdsourced challenge was hosted to foster AI algorithm development focused on interpreting screening mammography. More than 1100 participants comprising 126 teams from 44 countries participated. Analysis began November 18, 2016. MAIN OUTCOMES AND MEASUREMENTS Algorithms used images alone (challenge 1) or combined images, previous examinations (if available), and clinical and demographic risk factor data (challenge 2) and output a score that translated to cancer yes/no within 12 months. Algorithm accuracy for breast cancer detection was evaluated using area under the curve, and algorithm specificity was compared with radiologists' specificity with radiologists' sensitivity set at 85.9% (United States) and 83.9% (Sweden). An ensemble method aggregating top-performing AI algorithms and radiologists' recall assessment was developed and evaluated. RESULTS Overall, 144 231 screening mammograms from 85 580 US women (952 cancer positive ≤12 months from screening) were used for algorithm training and validation. A second independent validation cohort included 166 578 examinations from 68 008 Swedish women (780 cancer positive). The top-performing algorithm achieved an area under the curve of 0.858 (United States) and 0.903 (Sweden), and 66.2% (United States) and 81.2% (Sweden) specificity at the radiologists' sensitivity, lower than community-practice radiologists' specificity of 90.5% (United States) and 98.5% (Sweden). Combining top-performing algorithms and US radiologist assessments resulted in a higher area under the curve of 0.942 and achieved a significantly improved specificity (92.0%) at the same sensitivity. CONCLUSIONS AND RELEVANCE While no single AI algorithm outperformed radiologists, an ensemble of AI algorithms combined with radiologist assessment in a single-reader screening environment improved overall accuracy. This study underscores the potential of using machine learning to improve overall screening mammography accuracy.
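As a rough illustration of the metrics and the ensembling described above, the sketch below computes a model's specificity at a fixed radiologist sensitivity and blends algorithm scores with a radiologist's binary recall decision. The function names and the simple 50/50 weighting are assumptions for illustration, not the challenge's actual evaluation or ensembling pipeline.

```python
# Illustrative sketch (assumed, not the challenge's code) of specificity at a
# fixed sensitivity and a simple algorithm/radiologist score ensemble.
import numpy as np
from sklearn.metrics import roc_curve

def specificity_at_sensitivity(y_true, scores, target_sensitivity=0.859):
    """Specificity at the score threshold whose sensitivity first reaches the
    radiologists' operating point (85.9% US, 83.9% Sweden in the study)."""
    fpr, tpr, _ = roc_curve(y_true, scores)
    idx = int(np.argmax(tpr >= target_sensitivity))  # first threshold reaching it
    return 1.0 - fpr[idx]

def ensemble_scores(algorithm_scores, radiologist_recall, weight=0.5):
    """Average the mean of several algorithms' scores with the radiologist's
    0/1 recall decision -- one simple way to combine the two sources."""
    mean_algo = np.mean(np.vstack(algorithm_scores), axis=0)
    return weight * mean_algo + (1.0 - weight) * np.asarray(radiologist_recall)

# sklearn.metrics.roc_auc_score(y_true, ensemble_scores(...)) would then give
# the ensemble's area under the curve on a validation cohort.
```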
Background: Detecting and segmenting brain metastases is a tedious and time-consuming task for many radiologists, particularly with the growing use of multisequence 3D imaging. Purpose: To demonstrate automated detection and segmentation of brain metastases on multisequence MRI using a deep-learning approach based on a fully convolutional neural network (CNN). Study Type: Retrospective. Population: In all, 156 patients with brain metastases from several primary cancers were included. Field Strength: 1.5T and 3T. Sequence: Pretherapy MR images included pre- and post-gadolinium T1-weighted 3D fast spin echo (CUBE), post-gadolinium T1-weighted 3D axial IR-prepped FSPGR (BRAVO), and 3D CUBE fluid-attenuated inversion recovery (FLAIR). Assessment: The ground truth was established by manual delineation by two experienced neuroradiologists. CNN training/development was performed using 100 and 5 patients, respectively, with a 2.5D network based on a GoogLeNet architecture. The results were evaluated in 51 patients, equally separated into those with few (1-3), multiple (4-10), and many (>10) lesions. Statistical Tests: Network performance was evaluated using precision, recall, Dice/F1 score, and receiver operating characteristic (ROC) curve statistics. For an optimal probability threshold, detection and segmentation performance was assessed on a per-metastasis basis. The Wilcoxon rank sum test was used to test the differences between patient subgroups. Results: The area under the ROC curve (AUC), averaged across all patients, was 0.98 ± 0.04. The AUC in the subgroups was 0.99 ± 0.01, 0.97 ± 0.05, and 0.97 ± 0.03 for patients having 1-3, 4-10, and >10 metastases, respectively. Using an average optimal probability threshold determined by the development set, precision, recall, and Dice score were 0.79 ± 0.20, 0.53 ± 0.22, and 0.79 ± 0.12, respectively. At the same probability threshold, the network showed an average false-positive rate of 8.3/patient (no lesion-size limit) and 3.4/patient (10 mm³ lesion-size limit). Data Conclusion: A deep-learning approach using multisequence MRI can automatically detect and segment brain metastases with high accuracy. Level of Evidence: 3. Technical Efficacy Stage: 2.
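For readers unfamiliar with the reported metrics, the following is a hedged sketch, with assumed helper names rather than the study's code, of how a CNN probability map can be thresholded and scored voxel-wise, and how per-patient false-positive lesions can be counted with connected components.

```python
# Assumed illustration: voxel-wise Dice/precision/recall from a thresholded
# probability map, plus a per-patient false-positive lesion count.
import numpy as np
from scipy import ndimage

def dice_precision_recall(prob_map, ground_truth, threshold=0.5):
    """Binarize the probability map and compare it voxel-wise with the
    neuroradiologists' manual delineation."""
    pred = prob_map >= threshold
    gt = ground_truth.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    return dice, precision, recall

def false_positive_lesions(prob_map, ground_truth, threshold=0.5, min_voxels=0):
    """Count predicted connected components that overlap no true metastasis,
    optionally ignoring components below a minimum size."""
    labels, n = ndimage.label(prob_map >= threshold)
    gt = ground_truth.astype(bool)
    fps = 0
    for i in range(1, n + 1):
        component = labels == i
        if component.sum() < min_voxels:
            continue
        if not np.logical_and(component, gt).any():
            fps += 1
    return fps
```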
Chest radiography represents the initial imaging test for important thoracic abnormalities ranging from pneumonia to lung cancer. Unfortunately, as the ratio of image volume to qualified radiologists has continued to increase, interpretation delays and backlogs have demonstrably reduced the quality of care in large health organizations, such as the U.K. National Health Service (1) and the U.S. Department of Veterans Affairs (2). The situation is even worse in resource-poor areas, where radiology services are extremely scarce (3,4). In this light, automated image analysis represents an appealing mechanism to improve throughput while maintaining, and potentially improving, quality of care. The remarkable success of machine learning techniques such as convolutional neural networks (CNNs) for image classification tasks makes these algorithms a natural choice for automated radiograph analysis (5,6), and they have already performed well for tasks such as skeletal bone age assessment (7-9), lung nodule classification (10), tuberculosis detection (11), high-throughput image retrieval (12,13), and evaluation of endotracheal tube positioning (14). However, a major challenge when applying such techniques to chest radiography at scale has been the limited availability of the large labeled data sets generally required to achieve high levels of performance (6). In response, the U.S. National Institutes of Health released a public chest radiograph database containing 112 120 frontal-view images with noisy multiclass labels extracted from associated text reports (15). That study also showed the challenges of achieving reliable multiclass thoracic diagnosis prediction with chest radiographs (15), potentially limiting the clinical utility of the resulting classifiers. Further, this method of disease-specific computer-assisted diagnosis may not ultimately be beneficial to the interpreting clinician (16).
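One common way to use such report-derived labels, shown in the hedged sketch below, is to treat each finding as an independent binary output trained with binary cross-entropy; the architecture and the 14-finding label count are illustrative assumptions, not the NIH baseline.

```python
# Hypothetical setup (not the NIH baseline): a CNN with one sigmoid output per
# finding, trained against noisy labels extracted from radiology reports.
import torch
import torch.nn as nn
import torchvision.models as models

NUM_FINDINGS = 14  # e.g., the finding classes released with the NIH database

model = models.densenet121()
model.classifier = nn.Linear(model.classifier.in_features, NUM_FINDINGS)
criterion = nn.BCEWithLogitsLoss()  # independent per-finding probabilities
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def training_step(images, labels):
    """labels: float tensor of shape (batch, NUM_FINDINGS) with 0/1 entries."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```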
Despite the relative ease of locating organs in the human body, automated organ segmentation has been hindered by the scarcity of labeled training data. Due to the tedium of labeling organ boundaries, most datasets are limited to either a small number of cases or a single organ. Furthermore, many are restricted to specific imaging conditions unrepresentative of clinical practice. To address this need, we developed a diverse dataset of 140 CT scans containing six organ classes: liver, lungs, bladder, kidney, bones and brain. For the lungs and bones, we expedited annotation using unsupervised morphological segmentation algorithms, which were accelerated by 3D Fourier transforms. Demonstrating the utility of the data, we trained a deep neural network which requires only 4.3 s to simultaneously segment all the organs in a case. We also show how to efficiently augment the data to improve model generalization, providing a GPU library for doing so. We hope this dataset and code, available through TCIA, will be useful for training and evaluating organ segmentation models.
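To illustrate why 3D Fourier transforms speed up the morphological steps, here is a minimal sketch assuming SciPy and a spherical structuring element: dilation becomes a large convolution, which is computed in the frequency domain. This is one plausible reading of the technique, not the code released with the dataset.

```python
# Assumed illustration: FFT-accelerated binary morphology on a 3D mask.
import numpy as np
from scipy.signal import fftconvolve

def fft_dilate(mask, radius=5):
    """Binary dilation with a spherical structuring element, implemented as an
    FFT-based convolution followed by thresholding."""
    z, y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1, -radius:radius + 1]
    ball = (z**2 + y**2 + x**2) <= radius**2
    conv = fftconvolve(mask.astype(np.float32), ball.astype(np.float32), mode="same")
    return conv > 0.5

def fft_erode(mask, radius=5):
    """Erosion via duality: eroding a mask equals complementing the dilation
    of its complement (the spherical element is symmetric)."""
    return ~fft_dilate(~mask.astype(bool), radius)
```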