Objective: In this large multi-institutional study, we aimed to analyze the prognostic power of computed tomography (CT)-based radiomics models in COVID-19 patients. Methods: CT images of 14,339 COVID-19 patients with overall survival outcome were collected from 19 medical centers. Whole lung segmentations were performed automatically using a previously validated deep learning-based model, and regions of interest were further evaluated and modified by a human observer. All images were resampled to an isotropic voxel size, intensities were discretized into 64 bins, and 105 radiomics features, including shape, intensity, and texture features, were extracted from the lung mask. Radiomics features were normalized using Z-score normalization. Highly correlated features (Pearson R² > 0.99) were eliminated. We applied the Synthetic Minority Oversampling Technique (SMOTE) to the training set only, for each model, to address class imbalance. We used four feature selection algorithms, namely Analysis of Variance (ANOVA), Kruskal-Wallis (KW), Recursive Feature Elimination (RFE), and Relief. For the classification task, we used seven classifiers: Logistic Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO), Linear Discriminant Analysis (LDA), Random Forest (RF), AdaBoost (AB), Naive Bayes (NB), and Multilayer Perceptron (MLP). The models were built and evaluated using training and testing sets, respectively. Specifically, we evaluated the models using 10 different splitting and cross-validation strategies, including different types of test datasets (e.g. non-harmonized vs. ComBat-harmonized datasets). The sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (AUC) were reported for model evaluation.
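The restriction of oversampling to the training set can be illustrated with a minimal, standard-library sketch of the SMOTE idea: synthetic minority samples are interpolated between a minority sample and one of its nearest minority neighbours, and the test set is never touched. This is not the study's implementation; the function names, the neighbour count k, the seed, and the toy feature vectors are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


def balance_training_set(X_train, y_train, minority_label=1, k=5, seed=0):
    """Oversample the minority class of the TRAINING set until classes match."""
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    n_majority = len(y_train) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X_train + new_points, y_train + [minority_label] * len(new_points)


# Toy training data (hypothetical values): four majority and two minority samples.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [1.0, 1.0], [1.2, 0.9]]
y_train = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance_training_set(X_train, y_train, k=1)
print(y_bal.count(0), y_bal.count(1))  # class counts after oversampling
```

Only the training split is passed to balance_training_set; the test split keeps its original class distribution, so reported test metrics reflect the real prevalence.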
Results: In the test dataset (n = 4,301) consisting of CT and/or RT-PCR positive cases, the ANOVA feature selector + RF classifier achieved an AUC, sensitivity, and specificity of 0.83 (sd: 0.01) (CI95%: 0.81-0.85), 0.81, and 0.72, respectively. In the RT-PCR-only positive test set (n = 3,644), similar results were achieved, with no statistically significant difference. In the ComBat-harmonized dataset, the Relief feature selector + RF classifier achieved the highest AUC, 0.83 (sd: 0.01) (CI95%: 0.81-0.85), with sensitivity and specificity of 0.77 and 0.74, respectively. ComBat harmonization did not yield a statistically significant improvement relative to the non-harmonized dataset. In leave-one-center-out evaluation, the combination of ANOVA feature selector and LR classifier achieved the highest AUC (0.80 (sd: 0.084)), with sensitivity and specificity of 0.77 (sd: 0.11) and 0.76 (sd: 0.075), respectively. Conclusion: Lung CT radiomics features can be used towards robust prognostic modeling of COVID-19 in large heterogeneous datasets gathered from multiple centers. As such, a CT radiomics-based model has significant potential for use in prospective clinical settings towards improved management of COVID-19 patients.
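The leave-one-center-out evaluation reported above holds out all patients of one center as the test set and trains on the remaining centers, repeating this once per center. A minimal sketch of the splitting step follows; the function name and the toy center labels are hypothetical, and the study's actual tooling is not stated in the abstract.

```python
from collections import defaultdict


def leave_one_center_out(center_ids):
    """Yield (held_out_center, train_indices, test_indices) once per center."""
    by_center = defaultdict(list)
    for index, center in enumerate(center_ids):
        by_center[center].append(index)
    for held_out in sorted(by_center):
        test_idx = by_center[held_out]
        train_idx = sorted(
            i for center, idx in by_center.items() if center != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


# Hypothetical center label for each of seven patients.
centers = ["A", "A", "B", "C", "B", "C", "C"]
for held_out, train_idx, test_idx in leave_one_center_out(centers):
    print(held_out, train_idx, test_idx)
```

Each fold produces one value per metric, which is why the leave-one-center-out results are summarized as a mean with a standard deviation across centers.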
Purpose: To derive and validate an effective radiomics-based model for differentiation of COVID-19 pneumonia from other lung diseases using a very large cohort of patients. Methods: We collected 19 private and 5 public datasets, accumulating to 26,307 individual patient images (15,148 COVID-19; 9,657 with other lung diseases, e.g. non-COVID-19 pneumonia, lung cancer, pulmonary embolism; 1,502 normal cases). Images were automatically segmented using a validated deep learning (DL) model and the results were carefully reviewed. Images were first cropped into lung-only region boxes, then resized to 296 by 216 voxels. Voxel dimensions were resized to 1 mm³, followed by 64-bin discretization. The 108 extracted features included shape, first-order histogram, and texture features. Univariate analysis was first performed using simple logistic regression. The thresholds were fixed in the training set and then evaluated on the test set. False discovery rate (FDR) correction was applied to the p-values. Z-score normalization was applied to all features. For multivariate analysis, highly correlated features (Pearson R² > 0.99) were eliminated first. We tested 96 different machine learning strategies by cross-combining 4 feature selectors or 8 dimensionality reduction techniques with 8 classifiers. We trained and evaluated our models using 3 different datasets: 1) the entire dataset (26,307 patients: 15,148 COVID-19; 11,159 non-COVID-19); 2) excluding normal patients in non-COVID-19, and including only RT-PCR positive COVID-19 cases in the COVID-19 class (20,697 patients: 12,419 COVID-19; 8,278 non-COVID-19); 3) including only non-COVID-19 pneumonia patients and a random sample of COVID-19 patients (5,582 patients: 3,000 COVID-19; 2,582 non-COVID-19) to provide balanced classes. Subsequently, each of these 3 datasets was randomly split into 70% and 30% for training and testing, respectively.
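The correlation-based elimination step (R² > 0.99) can be sketched as a greedy filter: a feature is kept only if its squared Pearson correlation with every already-kept feature stays at or below the threshold. The abstract does not say which member of a correlated pair is dropped, so keeping the first-seen feature is an assumption, as are the function names, the feature names, and the toy values.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant feature: treated as uncorrelated (assumption)
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(features, threshold=0.99):
    """Return names of features whose R^2 with every kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(pearson_r(values, features[k]) ** 2 <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns, one value per patient.
features = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "volume_ml": [10.0, 20.0, 30.0, 40.0, 50.0],  # exact rescaling of "volume"
    "entropy": [2.0, 1.0, 4.0, 3.0, 5.0],
}
print(drop_correlated(features))  # "volume_ml" is removed as redundant
```

Because Pearson correlation is invariant to per-feature shifting and positive rescaling, the filter gives the same result whether it runs before or after Z-score normalization.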
All steps, including feature preprocessing, feature selection, and classification, were performed separately in each dataset. Classification algorithms were optimized during training using grid search. The best models were chosen by a one-standard-deviation rule in 10-fold cross-validation and were then evaluated on the test sets. Results: In dataset #1, the Relief feature selection and Random Forest (RF) classifier combination resulted in the highest performance (area under the receiver operating characteristic curve (AUC) = 0.99, sensitivity = 0.98, specificity = 0.94, accuracy = 0.96, positive predictive value (PPV) = 0.96, and negative predictive value (NPV) = 0.96). In dataset #2, the Recursive Feature Elimination (RFE) feature selection and RF classifier combination resulted in the highest performance (AUC = 0.99, sensitivity = 0.98, specificity = 0.95, accuracy = 0.97, PPV = 0.96, and NPV = 0.98). In dataset #3, the ANOVA feature selection and RF classifier combination resulted in the highest performance (AUC = 0.98, sensitivity = 0.96, specificity = 0.93, accuracy = 0.94, PPV = 0.93, and NPV = 0.96). Conclusion: Radiomic features extracted from the entire lung combined with machine learning algorithms can enable very effective, routine diagnosis of COVID-19 pneumonia from CT images without the use of any other diagnostic test.
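The threshold-dependent metrics reported above (sensitivity, specificity, accuracy, PPV, NPV) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the toy labels are hypothetical and are not data from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix metrics for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "accuracy": ratio(tp + tn, len(pairs)),
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Toy labels (hypothetical): 1 = COVID-19, 0 = non-COVID-19.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
for name, value in classification_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on class prevalence, so their values are tied to the class balance of the particular test set.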
When faced with a hypervascular mediastinal tumor, mediastinal hemangioma should be taken into consideration. Although it is uncommon, considering this diagnosis may avoid unnecessary extensive surgery.