Stroke is a serious disease that significantly affects patients' quality of life and safety. Accurately predicting stroke risk is therefore important for preventing and treating stroke. In the past few years, machine learning methods have shown potential in predicting stroke risk. However, because stroke data are imbalanced and feature selection and model selection remain challenging, stroke risk prediction still faces difficulties. This article compares the performance of different sampling algorithms and machine learning methods in stroke risk prediction. The study used over-sampling algorithms (Random Over Sampling and SMOTE), under-sampling algorithms (Random Under Sampling and ENN), and a hybrid sampling algorithm (SMOTE-ENN), and combined them with common machine learning methods such as K-Nearest Neighbors, Logistic Regression (LR), Decision Tree, and Support Vector Machine to build prediction models. Analysis of the experimental results showed that SMOTE combined with the LR model performed well in stroke risk prediction, achieving a high F1 score. In addition, the study found that the under-sampling algorithms performed better overall than the over-sampling and hybrid sampling algorithms. These results provide a useful reference for predicting stroke risk and a foundation for further research and application. Future research can explore more sampling algorithms, machine learning methods, and feature engineering techniques to further improve the accuracy and interpretability of stroke risk prediction and promote its application in clinical practice.
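
To make the resampling strategies concrete, the following is a minimal standard-library sketch of random over-sampling, random under-sampling, a simplified SMOTE-style interpolation, and the F1 score. It is not the authors' implementation: the function names, the toy data, and the parameter `k` are hypothetical, the problem is assumed to be binary, and ENN, SMOTE-ENN, and the classifiers themselves are omitted. In practice a dedicated library such as imbalanced-learn would normally be used instead.

```python
import random
from collections import Counter


def random_over_sample(X, y, minority=1, seed=0):
    """Duplicate random minority rows until both classes have equal size."""
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    # Binary assumption: majority count minus minority count.
    deficit = len(y) - 2 * len(minority_rows)
    extra = [rng.choice(minority_rows) for _ in range(max(deficit, 0))]
    return list(X) + extra, list(y) + [minority] * len(extra)


def random_under_sample(X, y, minority=1, seed=0):
    """Keep all minority rows and an equally sized random subset of the rest."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    majority_idx = [i for i, label in enumerate(y) if label != minority]
    kept = rng.sample(majority_idx, min(len(minority_idx), len(majority_idx)))
    idx = sorted(minority_idx + kept)
    return [X[i] for i in idx], [y[i] for i in idx]


def smote(X, y, minority=1, k=3, seed=0):
    """Simplified SMOTE: interpolate between a minority row and one of its
    k nearest minority neighbours. Requires at least two minority rows."""
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    deficit = len(y) - 2 * len(minority_rows)
    synthetic = []
    for _ in range(max(deficit, 0)):
        base = rng.choice(minority_rows)
        others = [r for r in minority_rows if r is not base]
        # Rank the other minority rows by squared Euclidean distance to base.
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return list(X) + synthetic, list(y) + [minority] * len(synthetic)


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: six "no stroke" rows (0) and two "stroke" rows (1).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
     [0.0, 0.4], [0.4, 0.1], [2.0, 2.1], [2.2, 1.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_ros, y_ros = random_over_sample(X, y)
X_rus, y_rus = random_under_sample(X, y)
X_sm, y_sm = smote(X, y, k=1)
print(Counter(y_ros), Counter(y_rus), Counter(y_sm))
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution and the F1 score reflects performance on realistic data.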