Objective
This study aimed to construct a coronary heart disease (CHD) risk-prediction model in people living with human immunodeficiency virus (PLHIV) with the help of machine learning (ML) per electronic medical records (EMRs).
Methods
Sixty-one medical characteristics (including demography information, laboratory measurements, and complicating disease) readily available from EMRs were retained for clinical analysis. These characteristics further aided the development of prediction models by using seven ML algorithms [light gradient-boosting machine (LightGBM), support vector machine (SVM), eXtreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), decision tree, multilayer perceptron (MLP), and logistic regression]. The performance of this model was assessed using the area under the receiver operating characteristic curve (AUC). Shapley additive explanation (SHAP) was further applied to interpret the findings of the best-performing model.
Results
The LightGBM model exhibited the highest AUC (0.849; 95% CI, 0.814–0.883). Additionally, the SHAP plot per the LightGBM depicted that age, heart failure, hypertension, glucose, serum creatinine, indirect bilirubin, serum uric acid, and amylase can help identify PLHIV who were at a high or low risk of developing CHD.
Conclusion
This study developed a CHD risk prediction model for PLHIV utilizing ML techniques and EMR data. The LightGBM model exhibited improved comprehensive performance and thus had higher reliability in assessing the risk predictors of CHD. Hence, it can potentially facilitate the development of clinical management techniques for PLHIV care in the era of EMRs.