Objectives
The purpose of this study was to build a deep learning model to derive labels from neuroradiology reports and assign these to the corresponding examinations, overcoming a bottleneck to computer vision model development.

Methods
Reference-standard labels were generated by a team of neuroradiologists for model training and evaluation. Three thousand examinations were labelled for the presence or absence of any abnormality by manually scrutinising the corresponding radiology reports ('reference-standard report labels'); a subset of these examinations (n = 250) were assigned 'reference-standard image labels' by interrogating the actual images. Separately, 2000 reports were labelled for the presence or absence of 7 specialised categories of abnormality (acute stroke, mass, atrophy, vascular abnormality, small vessel disease, white matter inflammation, encephalomalacia), with a subset of these examinations (n = 700) also assigned reference-standard image labels. A deep learning model was trained using labelled reports and validated in two ways: comparing predicted labels to (i) reference-standard report labels and (ii) reference-standard image labels. The area under the receiver operating characteristic curve (AUC-ROC) was used to quantify model performance. Accuracy, sensitivity, specificity, and F1 score were also calculated.

Results
Accurate classification (AUC-ROC > 0.95) was achieved for all categories when tested against reference-standard report labels. A drop in performance (ΔAUC-ROC > 0.02) was seen for three categories (atrophy, encephalomalacia, vascular) when tested against reference-standard image labels, highlighting discrepancies in the original reports. Once trained, the model assigned labels to 121,556 examinations in under 30 min.

Conclusions
Our model accurately classifies head MRI examinations, enabling automated dataset labelling for downstream computer vision applications.
Key Points
• Deep learning is poised to revolutionise image recognition tasks in radiology; however, a barrier to clinical adoption is the difficulty of obtaining large labelled datasets for model training.
• We demonstrate a deep learning model which can derive labels from neuroradiology reports and assign these to the corresponding examinations at scale, facilitating the development of downstream computer vision models.
• We rigorously tested our model by comparing labels predicted on the basis of neuroradiology reports with two sets of reference-standard labels: (1) labels derived by manually scrutinising each radiology report and (2) labels derived by interrogating the actual images.
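The abstract above evaluates the report classifier with AUC-ROC, accuracy, sensitivity, specificity, and F1 score. As a minimal illustrative sketch (not the authors' code), these metrics can be computed from binary reference labels and model scores as follows; the function names here are my own:

```python
# Illustrative sketch: the binary-classification metrics named in the abstract.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels (1 = abnormal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity_specificity_f1(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sens = tp / (tp + fn)            # recall on abnormal examinations
    spec = tn / (tn + fp)
    prec = tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    return sens, spec, f1

def auc_roc(y_true, scores):
    """AUC-ROC via the rank-sum (Mann-Whitney U) formulation."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):            # assign average ranks to tied scores
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1        # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    pos = [ranks[i] for i in range(len(y_true)) if y_true[i] == 1]
    n_pos, n_neg = len(pos), len(y_true) - len(pos)
    return (sum(pos) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

In practice a library implementation (e.g. scikit-learn) would be used; the rank-sum form above is shown only to make the AUC-ROC definition concrete.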
Objective
Automatic segmentation of vestibular schwannoma (VS) from routine clinical MRI can improve clinical workflow, facilitate treatment decisions, and assist patient management. Previously, excellent automatic segmentation results were achieved on datasets of standardised MRI images acquired for stereotactic surgery planning. However, diagnostic clinical datasets are generally more diverse and pose a larger challenge to automatic segmentation algorithms. Here, we show that automatic segmentation of VS on such datasets is also possible with high accuracy.

Methods
We acquired a large multi-centre routine clinical (MC-RC) dataset of 168 patients with a single sporadic VS who were referred from 10 medical sites and consecutively seen at a single centre. Up to three longitudinal MRI exams were selected for each patient. Selection rules based on image modality, resolution, orientation, and acquisition timepoint were defined to automatically select contrast-enhanced T1-weighted (ceT1w) images (n=130) and T2-weighted images (n=379). Manual ground truth segmentations were obtained in an iterative process in which segmentations were: 1) produced or amended by a specialized company; 2) reviewed by one of three trained radiologists; and 3) validated by an expert team. Inter- and intra-observer reliability was assessed on a subset of 10 ceT1w and 41 T2w images. The MC-RC dataset was split randomly into 3 nonoverlapping sets for model training, hyperparameter tuning, and testing in proportions 70/10/20%. We applied deep learning to train our VS segmentation model, based on convolutional neural networks (CNN) within the nnU-Net framework.

Results
Our model achieved excellent Dice scores when evaluated on the MC-RC testing set as well as the public testing set.
On the MC-RC testing set, Dice scores were 90.8±4.5% for ceT1w, 86.1±11.6% for T2w, and 82.3±18.4% for a combined ceT1w+T2w input.

Conclusions
We developed a model for automatic VS segmentation on diverse multi-centre clinical datasets. The results show that the performance of the framework is comparable to that of human annotators. In contrast, a model trained on a publicly available dataset acquired for Gamma Knife stereotactic radiosurgery did not perform well on the MC-RC testing set. The application of our model has the potential to greatly facilitate the management of patients in clinical practice. Our pre-trained segmentation models are made available online. Moreover, we are in the process of making the MC-RC dataset publicly available.
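The Dice scores reported above measure voxel-wise overlap between predicted and ground-truth segmentation masks. A minimal sketch of the metric (not the authors' evaluation pipeline; the function name and example masks are my own) under the standard definition Dice = 2|A ∩ B| / (|A| + |B|):

```python
# Illustrative sketch: Dice similarity coefficient for binary segmentation masks.
import numpy as np

def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:                   # both masks empty: treat as perfect match
        return 1.0
    return 2.0 * intersection / total

# Toy example: two 2-D masks that agree on 2 of 3 foreground voxels each,
# giving Dice = 2*2 / (3+3) ≈ 0.667
pred  = [[1, 1, 0],
         [0, 1, 0]]
truth = [[1, 1, 0],
         [0, 0, 1]]
print(dice_score(pred, truth))
```

A reported score of 90.8% thus corresponds to a mean per-case Dice of 0.908 over the testing set.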