Background and Purpose: Stroke-related functional risk scores are used to predict patients' functional outcomes following a stroke event. We evaluate the accuracy of machine learning algorithms in predicting functional outcomes in acute ischemic stroke patients after endovascular treatment. Methods: Data were from the Precise and Rapid Assessment of Collaterals with Multi-phase CT Angiography (PROVE-IT) study, an observational study of 614 ischemic stroke patients. Regression and machine learning models, including random forest (RF), classification and regression tree (CART), C5.0 decision tree (DT), support vector machine (SVM), adaptive boost machine (ABM), least absolute shrinkage and selection operator (LASSO) logistic regression, and logistic regression models, were trained to predict the risk of 90-day functional impairment, defined as a modified Rankin Scale (mRS) score > 2. The models were internally validated using split-sample cross-validation and externally validated in the INTERRSeCT cohort study. The accuracy of these models was evaluated using the area under the receiver operating characteristic curve (AUC), Matthews Correlation Coefficient (MCC), and Brier score. Results: Of the 614 patients included in the training data, 249 (40.5%) had 90-day functional impairment (i.e., mRS > 2). The median age and baseline NIHSS score were 77 years (IQR = 69-83) and 17 (IQR = 11-22), respectively, where IQR denotes the interquartile range. Logistic regression and machine learning models had comparable predictive accuracy when validated internally (AUC range = [0.65-0.72]; MCC range = [0.29-0.42]) and externally (AUC range = [0.66-0.71]; MCC range = [0.34-0.42]). Conclusions: Machine learning algorithms and logistic regression had comparable accuracy in predicting stroke-related functional impairment in stroke patients.
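The three accuracy measures named in this abstract (AUC, MCC, and Brier score) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function names, the 0.5 classification cut-off, and the toy mRS scores and predicted probabilities are hypothetical and are not taken from the PROVE-IT or INTERRSeCT data. Only the outcome definition (mRS > 2) comes from the abstract.

```python
from math import sqrt

def dichotomize_mrs(mrs_scores, threshold=2):
    """Label 1 (functional impairment) when mRS > threshold, else 0."""
    return [1 if s > threshold else 0 for s in mrs_scores]

def auc(y_true, y_prob):
    """AUC as the probability that a random positive case is ranked
    above a random negative case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient from the 2x2 confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def brier(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Hypothetical toy data: 90-day mRS scores and model-predicted risks.
mrs = [0, 1, 3, 4, 2, 5, 1, 6]
prob = [0.1, 0.4, 0.8, 0.35, 0.2, 0.9, 0.5, 0.7]

y = dichotomize_mrs(mrs)                     # outcome: mRS > 2
pred = [1 if p >= 0.5 else 0 for p in prob]  # assumed 0.5 cut-off

print(f"AUC   = {auc(y, prob):.3f}")
print(f"MCC   = {mcc(y, pred):.3f}")
print(f"Brier = {brier(y, prob):.3f}")
```

AUC and the Brier score are computed from the predicted probabilities, whereas MCC needs a hard classification, so its value depends on the chosen cut-off.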
Background and Purpose: The burden of stroke-related functional impairment remains high among stroke survivors. Clinical prediction models are commonly used to estimate patients' risk of functional impairment. However, these models have principally been developed using regression models, which are sensitive to multicollinearity. This study investigates whether machine learning models offer any advantage for developing stroke-related functional impairment risk prediction tools. Methods: Data were from a multi-center hospital-based cohort study (n = 614). The modified Rankin Scale (mRS) score was used to assess 90-day functional impairment status. Machine learning and regression-based models were used to predict the patient-specific risk of 90-day functional impairment. The area under the receiver operating characteristic curve (AUC) was used to assess the predictive accuracy of these models via internal cross-validation and external validation in the ESCAPE randomized controlled trial data. Results: Of the 614 patients included in the analyses, 348 (56.7%) had some form of functional impairment (i.e., mRS > 1) and 313 (50.9%) were male; the median age and baseline NIHSS score were 72 years (IQR = 63-80) and 12 (IQR = 6-19), respectively, where IQR denotes the interquartile range. In internal cross-validation, the AUCs for the regression models were 68.3% (95% CI = [63.9% - 76.5%]) and 70.1% (95% CI = [63.5% - 76.1%]), while the AUCs for the machine learning models ranged from 62.7% to 68.8%. When these models were externally validated in the ESCAPE data, however, the AUCs for the regression models were 39.6% (95% CI = [36.1% - 47.5%]) and 35.8% (95% CI = [30.4% - 41.5%]), while the AUCs for the machine learning models ranged from 61.6% (95% CI = [58.2% - 67.3%]) to 66.7% (95% CI = [61.3% - 72.3%]).
Conclusions: This study shows that while differences between machine learning and regression-based risk prediction models were negligible under internal validation, the machine learning models were more accurate than the regression-based models in predicting stroke-related functional impairment under external validation. Future research will use Monte Carlo methods to develop recommendations for selecting machine learning models under a variety of data characteristics.
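The abstract reports a 95% confidence interval for each AUC but does not say how the intervals were obtained. One common choice is a percentile bootstrap, sketched below purely as an assumption. The function names, the number of resamples, and the synthetic outcomes and risks are all hypothetical and do not reproduce the study's data or method.

```python
import random

def auc(y_true, y_prob):
    """AUC as the probability that a random positive case is ranked
    above a random negative case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, y_prob, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method).

    Patients are resampled with replacement; resamples containing only
    one outcome class are redrawn because the AUC is undefined for them.
    Assumes the input itself contains both outcome classes.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [y_prob[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic, illustrative data only: 120 binary outcomes and noisy risks.
rng = random.Random(42)
y = [rng.randint(0, 1) for _ in range(120)]
prob = [min(1.0, max(0.0, 0.35 + 0.3 * label + rng.gauss(0, 0.25)))
        for label in y]

point = auc(y, prob)
lower, upper = bootstrap_auc_ci(y, prob)
print(f"AUC = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

An AUC of 0.5 corresponds to chance-level ranking, so an interval lying entirely below 0.5, as reported here for the regression models in the external data, indicates ranking worse than chance in that cohort.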