We propose a novel framework for determining radiomics feature robustness by considering the effects of both biological and noise signals. The framework is preliminarily tested in a study predicting epidermal growth factor receptor (EGFR) mutation status in non-small cell lung cancer (NSCLC) patients. Pairs of CT images (baseline and 3 weeks post-therapy) of 46 NSCLC patients with known EGFR mutation status were collected, and an FDA-customized anthropomorphic thoracic phantom was scanned on two vendors’ scanners at four different tube currents. Delta radiomics features were extracted from the NSCLC patient CTs, and reproducible, non-redundant, and informative features were identified. The feature value differences between EGFR mutant and EGFR wildtype patients were quantitatively measured as the biological signal. Similarly, radiomics features were extracted from the phantom CTs, and the feature value difference from a pairwise comparison between settings was quantitatively measured as the noise signal. At each setting, the biological signal was compared with the noise signal by two-sample t-test, and a feature was considered robust when the two distributions differed significantly. Four optimal features were selected to predict EGFR mutation status: Tumor-Mass, Sigmoid-Offset-Mean, Gabor-Energy, and DWT-Energy, which quantified tumor mass, tumor-parenchyma density transition at the boundary, line-like pattern inside the tumor, and intratumoral heterogeneity, respectively. The first three features showed robustness across the majority of studied CT acquisition parameters. The textural feature DWT-Energy was less robust. The proposed framework was able to determine the robustness of radiomics features at specific settings by comparing biological signal to noise signal. Identification of robust radiomics features may improve the generalizability of radiomics models in future studies.
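The comparison of biological signal with noise signal can be illustrated with a minimal sketch. The abstract states only that a two-sample t-test was used; the Welch (unequal-variance) variant, the function name `welch_t`, and all numeric values below are assumptions made for illustration and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Squared standard error contributed by each sample.
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical feature value differences (made-up numbers):
# biological = differences between EGFR mutant and wildtype patients,
# noise = differences between two phantom acquisition settings.
biological = [0.82, 0.91, 0.77, 0.88, 0.95, 0.80]
noise = [0.10, 0.14, 0.08, 0.12, 0.11, 0.09]

t_stat, dof = welch_t(biological, noise)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Turning `t_stat` and `dof` into a p-value requires the t distribution, for example from a statistics library. Under the framework described above, a feature would be called robust at a given setting when that p-value falls below the chosen significance level.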
Background: Predictive models utilizing social determinants of health (SDH), demographic data, and local weather data were trained to predict missed imaging appointments (MIA) among breast imaging patients at the Boston Medical Center (BMC). Patients were characterized by many different variables, including social needs, demographics, imaging utilization, appointment features, and weather conditions on the date of the appointment.

Methods: This HIPAA-compliant retrospective cohort study was IRB-approved. Informed consent was waived. After data preprocessing, the dataset contained 9,970 patients and 36,606 appointments from 1/1/2015 to 12/31/2019. We identified 57 potentially impactful variables for the initial prediction model and assessed each patient for MIA. We then developed a parsimonious model via recursive feature elimination, which identified the 25 most predictive variables. We used linear and non-linear models, including support vector machines (SVM), logistic regression (LR), and random forest (RF), to predict MIA and compared their performance.

Results: The highest-performing full model was the nonlinear RF, with an area under the ROC curve (AUC) of 76% and an average F1 score of 85%. Models limited to the most predictive variables attained AUC and F1 scores comparable to those of models with all variables included. The variables most predictive of missed appointments included timing, prior appointment history, referral department of origin, and socioeconomic factors such as household income and access to caregiving services.

Conclusions: Prediction of MIA with the data available is inherently limited by the complex, multifactorial nature of MIA. However, the algorithms presented achieved acceptable performance and demonstrated that socioeconomic factors were useful predictors of MIA. In contrast with non-modifiable demographic factors, SDH can be addressed to decrease the incidence of MIA.
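The two metrics reported for the MIA models, AUC and F1, can be computed as in the following minimal sketch. The labels, scores, and the 0.5 decision threshold are hypothetical, and the sketch computes F1 for the positive (missed appointment) class only; the abstract does not say how its "average F1" was averaged, so this is an illustration rather than the study's evaluation procedure.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = missed appointment, 0 = attended.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

auc = roc_auc(y_true, scores)
f1 = f1_score(y_true, y_pred)
print(f"AUC = {auc:.4f}, F1 = {f1:.2f}")  # AUC = 0.9375, F1 = 0.75
```

AUC depends only on how the model ranks appointments, whereas F1 depends on the chosen threshold, which is why the two numbers can differ substantially for the same model.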