Background COVID-19 is a rapidly emerging respiratory disease caused by SARS-CoV-2. Due to the rapid human-to-human transmission of SARS-CoV-2, many health care systems are at risk of exceeding their health care capacities, in particular in terms of SARS-CoV-2 tests, hospital and intensive care unit (ICU) beds, and mechanical ventilators. Predictive algorithms could potentially ease the strain on health care systems by identifying those who are most likely to receive a positive SARS-CoV-2 test, be hospitalized, or be admitted to the ICU. Objective The aim of this study is to develop, study, and evaluate clinical predictive models that use machine learning on routinely collected clinical data to estimate which patients are likely to receive a positive SARS-CoV-2 test or require hospitalization or intensive care. Methods Using a systematic approach to model development and optimization, we trained and compared various types of machine learning models, including logistic regression, neural networks, support vector machines, random forests, and gradient boosting. To evaluate the developed models, we performed a retrospective evaluation on demographic, clinical, and blood analysis data from a cohort of 5644 patients. In addition, we used causal explanations to determine which clinical features were predictive, and to what degree, for each of the aforementioned clinical tasks. Results Our experimental results indicate that our predictive models identified patients who test positive for SARS-CoV-2 a priori at a sensitivity of 75% (95% CI 67%-81%) and a specificity of 49% (95% CI 46%-51%), SARS-CoV-2-positive patients who require hospitalization with 0.92 area under the receiver operating characteristic curve (AUC; 95% CI 0.81-0.98), and SARS-CoV-2-positive patients who require critical care with 0.98 AUC (95% CI 0.95-1.00).
Conclusions Our results indicate that predictive models trained on routinely collected clinical data could be used to predict clinical pathways for COVID-19 and, therefore, help inform care and prioritize resources.
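The sensitivity and specificity figures reported above, together with their 95% CIs, can be illustrated with a minimal Python sketch. The abstract does not say how the intervals were computed, so the Wilson score interval used here is an assumption, and the function names and toy labels are hypothetical.

```python
import math

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    Assumes both classes occur at least once in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, z=1.96 for 95%)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half

# Hypothetical toy data: 4 positive and 4 negative patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
low, high = wilson_interval(3, 4)  # 3 of 4 positives detected
```

With so few samples the interval is very wide, which is why cohort size matters when comparing reported sensitivities.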
Privacy concerns around sharing personally identifiable information are a major barrier to data sharing in medical research. In many cases, researchers have no interest in a particular individual’s information but rather aim to derive insights at the level of cohorts. Here, we utilise generative adversarial networks (GANs) to create medical imaging datasets consisting entirely of synthetic patient data. The synthetic images ideally have, in aggregate, similar statistical properties to those of a source dataset but do not contain sensitive personal information. We assess the quality of synthetic data generated by two GAN models for chest radiographs with 14 radiology findings and brain computed tomography (CT) scans with six types of intracranial haemorrhages. We measure the synthetic image quality by the performance difference between predictive models trained on the synthetic dataset and those trained on the real dataset. We find that synthetic data performance disproportionately benefits from a reduced number of classes. Our benchmark also indicates that at low numbers of samples per class, label overfitting effects start to dominate GAN training. We conducted a reader study in which trained radiologists discriminated between synthetic and real images. In accordance with our benchmark results, the classification accuracy of radiologists improves with increasing resolution. Our study offers valuable guidelines and outlines practical conditions under which insights derived from synthetic images are similar to those that would have been derived from real data. Our results indicate that synthetic data sharing may be an attractive alternative to sharing real patient-level data in the right setting.
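The quality measure described above (the performance difference between a model trained on real data and one trained on synthetic data) can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid classifier stands in for the predictive models, accuracy stands in for the performance metric, both models are scored on the same held-out real test set, and all names and toy feature vectors are hypothetical rather than taken from the study.

```python
def fit_centroids(features, labels):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )

def accuracy(centroids, features, labels):
    hits = sum(predict(centroids, x) == y for x, y in zip(features, labels))
    return hits / len(labels)

def performance_gap(real_train, synthetic_train, real_test):
    """Accuracy of the real-trained model minus that of the synthetic-trained
    model, both evaluated on the same held-out real test set."""
    real_model = fit_centroids(*real_train)
    synthetic_model = fit_centroids(*synthetic_train)
    return accuracy(real_model, *real_test) - accuracy(synthetic_model, *real_test)

# Hypothetical 2-D feature vectors standing in for image-derived features.
real_train = ([[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]], [0, 0, 1, 1])
synthetic_train = ([[0.1, 0.2], [0.0, 0.0], [1.1, 1.0], [0.8, 0.9]], [0, 0, 1, 1])
real_test = ([[0.1, 0.1], [0.9, 0.8], [0.3, 0.2], [0.7, 1.0]], [0, 1, 0, 1])
gap = performance_gap(real_train, synthetic_train, real_test)
```

A gap close to zero suggests that, for this task, conclusions drawn from the synthetic data would resemble those drawn from the real data; a large positive gap suggests the synthetic data lost task-relevant structure.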
Hypothesis Obesity is one of the main drivers of type 2 diabetes (T2D), but it is not uniformly associated with the disease. The location of fat accumulation is critical for metabolic health. Specific patterns of body fat distribution, such as visceral fat accumulation, are closely related to insulin resistance. There might be further, hitherto unknown features of body fat distribution which could additionally contribute to the disease. Methods We used machine learning with dense convolutional neural networks (DCNN) to detect diabetes-related variables from 2,371 T1-weighted whole-body magnetic resonance imaging (MRI) datasets. MRI was performed in participants undergoing metabolic screening with oral glucose tolerance tests. Models were trained for sex, age, BMI, insulin sensitivity, HbA1c and prediabetes or incident diabetes. The results were compared to conventional models. Results The Area Under the Receiver Operator Characteristic curve was 87% for the T2D discrimination and 68% for prediabetes, both superior to conventional models. Mean absolute regression errors were comparable to conventional models. Heatmaps showed that lower visceral abdominal regions were critical in diabetes classification. Subphenotyping revealed a group with high future diabetes and microalbuminuria risk. Interpretation Our results show that diabetes is detectable from whole-body MRI without additional data. Our technique of heatmap visualization unravels plausible anatomical regions and highlights the leading role of fat accumulation in the lower abdomen in diabetes pathogenesis.
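For readers unfamiliar with the discrimination metric reported above, the area under the ROC curve can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The Python sketch below uses this pairwise (Mann-Whitney) formulation; it is an illustration only, not the evaluation code of the study, and the function name and toy scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels are coded 0/1; ties between a positive and a negative score count 0.5.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical model scores for two cases without and two with diabetes.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value of 0.87 means roughly 87% of positive-negative pairs are ranked correctly.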
Obesity is one of the main drivers of the globally rising prevalence of type 2 diabetes (T2D). Yet, obesity is not uniformly associated with metabolic consequences. The location of fat accumulation is critical for metabolic health. Specific patterns of body fat distribution, such as an increased ratio of visceral to subcutaneous fat, are closely related to insulin resistance, which is crucial in the pathogenesis of T2D. There might be further, hitherto unknown features of body fat distribution which could additionally contribute to the disease. We used a machine learning approach with dense convolutional neural networks (DCNN) to detect diabetes-related variables from 2371 T1-weighted whole-body magnetic resonance imaging (MRI) data sets. Each single measurement was labelled by sex, age, BMI, insulin sensitivity, HbA1c and prediabetes or incident diabetes. The results were compared to conventional models using segmented body fat compartment volumes. Anatomical labels were assigned to locations in the DCNN gradient heatmaps that are critical for discrimination. The AUC-ROC was 0.87 for the discrimination of diabetes and 0.68 for prediabetes. Classification performance was superior to conventional models. Mean absolute regression errors were comparable to those of the conventional models. Heatmaps clearly showed that lower visceral abdominal regions were most critical in diabetes classification, while other significant areas comprised upper legs, arms and the neck region. Our results show that diabetes is detectable from whole-body MRI without any blood glucose measurement. Our technique of heatmap visualization unravels plausible anatomical regions and highlights the leading role of fat accumulation in the lower abdomen in the pathogenesis of T2D.
Disclosure R. Wagner: Advisory Panel; Self; Novo Nordisk A/S. Speaker’s Bureau; Self; Novo Nordisk A/S. Other Relationship; Self; Eli Lilly and Company. B. Dietz: None. J. Machann: None. P. Schwab: Employee; Self; Roche Pharma. J.K. Dienes: Advisory Panel; Spouse/Partner; Novo Nordisk A/S. Speaker’s Bureau; Spouse/Partner; Novo Nordisk A/S. Other Relationship; Spouse/Partner; Eli Lilly and Company. S. Reichert: Other Relationship; Self; Lilly Diabetes. A.L. Birkenfeld: None. H. Haering: None. F. Schick: None. N. Stefan: None. M. Heni: Research Support; Self; Boehringer Ingelheim Pharmaceuticals, Inc., Sanofi. Speaker’s Bureau; Self; Novo Nordisk A/S. H. Preissl: None. B. Schölkopf: None. S. Bauer: None. A. Fritsche: None.
Funding German Federal Ministry of Education and Research (01GI0925)