Introduction:
Heart failure (HF) may induce bowel hypoperfusion, leading to hypoxia of the villi of the bowel wall and the occurrence of Clostridioides difficile infection (CDI). However, the risk factors for the development of CDI in HF patients have yet to be fully elucidated, particularly given the lack of evidence from real-world data.
Methods:
Clinical data and survival outcomes of HF patients with CDI admitted to the ICU were extracted from the Medical Information Mart for Intensive Care (MIMIC)-IV database. To develop a model that predicts 28-day all-cause mortality in HF patients with CDI, the Recursive Feature Elimination with Cross-Validation (RFE-CV) method was used for feature selection. Nine machine learning (ML) algorithms were applied for model construction: logistic regression (LR), decision tree (DT), Bayesian, adaptive boosting (AdaBoost), random forest (RF), gradient boosting decision tree (GBDT), XGBoost, light gradient boosting machine (LightGBM), and categorical boosting (CatBoost). After the models were trained and their hyperparameters optimized through grid search with 5-fold cross-validation, model performance was evaluated by the area under the curve (AUC), accuracy, sensitivity, specificity, precision, negative predictive value, and F1 score. Furthermore, the SHapley Additive exPlanations (SHAP) method was used to interpret the optimal model.
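The evaluation metrics listed above can be sketched as follows. This is a minimal illustration in plain Python, not the study's code: the function names, the 0.5 decision cutoff, and the toy labels and scores are all assumptions made for the example. AUC is computed here as the fraction of (death, survivor) pairs in which the death received the higher predicted score, with ties counted as one half.

```python
# Illustrative sketch: toy labels and scores, not data from the study.
# Label convention assumed here: 1 = died within 28 days, 0 = survived.

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def classification_metrics(y_true, y_pred):
    """Threshold-based metrics derived from the confusion matrix."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "precision": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))


# Hypothetical predictions for ten patients, three of whom died.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1, 0.1, 0.2, 0.05]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 cutoff

toy_metrics = classification_metrics(toy_labels, toy_preds)
toy_auc = auc(toy_labels, toy_scores)
```

With an outcome prevalence near 19%, as in this cohort, accuracy alone can look favorable even when few deaths are detected, which is one reason to report sensitivity, precision, and F1 alongside it.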
Results:
A total of 526 HF patients with CDI were included in the study, of whom 99 (18.8%) died within 28 days. Eighteen of the 57 variables were selected for model construction. Among the ML models considered, the RF model emerged as the optimal model, achieving accuracy, F1 score, and AUC values of 0.821, 0.596, and 0.864, respectively. In decision curve analysis, the net benefit of the RF model exceeded that of the other models at threshold probabilities of 16% to 22%. According to feature importance in the RF model, the five most influential variables were red blood cell distribution width, blood urea nitrogen, Simplified Acute Physiology Score II, Sequential Organ Failure Assessment score, and white blood cell count.
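For readers unfamiliar with decision curve analysis, the net benefit at a threshold probability p_t is conventionally defined as TP/n - (FP/n) * p_t / (1 - p_t). The sketch below applies that standard formula to the "treat all" reference strategy using only the cohort counts reported above (526 patients, 99 deaths). The function and variable names are hypothetical, and the per-model true-positive and false-positive counts from the study are not available here, so no model curve is reproduced.

```python
# Illustrative sketch of the standard net-benefit formula used in
# decision curve analysis. Only the cohort size and death count come
# from the abstract; everything else is an assumption for illustration.

def net_benefit(tp, fp, n, threshold):
    """Net benefit = TP/n - (FP/n) * threshold / (1 - threshold)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1 - threshold)
    return tp / n - (fp / n) * odds


N_PATIENTS = 526  # HF patients with CDI reported in the Results
N_DEATHS = 99     # deaths within 28 days reported in the Results


def treat_all_net_benefit(threshold):
    """Reference strategy that flags every patient as high risk."""
    false_positives = N_PATIENTS - N_DEATHS
    return net_benefit(N_DEATHS, false_positives, N_PATIENTS, threshold)


# Treat-all reference across the threshold range reported for the RF model.
treat_all_curve = {pt: treat_all_net_benefit(pt)
                   for pt in (0.16, 0.18, 0.20, 0.22)}
```

The treat-all reference crosses zero where the threshold equals the observed mortality (about 18.8%), so the reported 16% to 22% range sits around the point at which flagging every patient stops being beneficial. A model is useful in that range only if its net benefit exceeds both this reference and zero (the "treat none" strategy).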
Conclusions:
We developed ML models to predict 28-day all-cause mortality in HF patients with CDI in the ICU, and these models were more effective than the conventional logistic regression model. The RF model performed best among all the ML models evaluated and may help clinicians identify high-risk HF patients with CDI.