Stroke is a high-risk neurological condition caused by blockages or
bleeding in the brain, leading to death or disability. This study
proposes a model to address class imbalance in limited patient data. The
proposed model uses the MissForest method, a Random Forest regression-based
algorithm, to impute missing data and an artificial immune system
algorithm whose parameters are updated using the Firefly algorithm to
stabilize the data. The One-Sided Selection method is used to improve
performance on the minority class. The model was evaluated in two
experiments, one using all features and the other selecting the best
features using the Artificial Bee Colony (ABC) algorithm. The models
were trained using six different classification algorithms: CatBoost,
Light Gradient Boosting Machine (LightGBM), Gradient Boosting (GB),
Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), and
Logistic Regression (LR). Results are reported as accuracy, specificity,
and sensitivity. When trained using all features, the model achieved an accuracy
of 77%, specificity of 44%, and sensitivity of 77%. When trained
using the best features selected by the ABC algorithm, the model
achieved an accuracy of 81%, specificity of 61%, and sensitivity of
81%. Compared to previous studies, the proposed model was effective in
both experiments.
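
For reference, the three reported metrics follow from the confusion matrix in the standard way: sensitivity is the fraction of positive cases correctly identified, and specificity is the fraction of negative cases correctly identified. The sketch below is illustrative only and is not the evaluation code used in the study. The function names, the toy labels, and the choice of 1 as the positive (stroke) label are assumptions made for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary problem.

    `positive` marks the class treated as positive (assumed here to be
    the stroke class; this labeling is an assumption of the sketch).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # True positive rate: TP / (TP + FN).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # True negative rate: TN / (TN + FP).
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

On imbalanced data, accuracy alone can hide poor performance on one class, which is why sensitivity and specificity are reported alongside it.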