Objective
This study aimed to develop and validate a machine learning model for predicting invasive Klebsiella pneumoniae liver abscess syndrome (IKPLAS) in patients with diabetes mellitus, and to compare the performance of different models.
Methods
Clinical signs and admission data from 213 diabetic patients with Klebsiella pneumoniae liver abscess were collected as candidate variables. The optimal feature variables were selected, and Artificial Neural Network (ANN), Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbor (KNN), Decision Tree (DT), and XGBoost (XGB) models were then established. Model performance was evaluated using the receiver operating characteristic (ROC) curve, sensitivity (recall), specificity, accuracy, precision, F1-score, average precision (AP), calibration curves, and decision curve analysis (DCA).
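For readers unfamiliar with the threshold-based metrics listed above, the following is a minimal sketch of how they follow from a binary confusion matrix. It is not the study's code: the function name `classification_metrics` and the toy labels are illustrative assumptions, and 1 is assumed to denote an IKPLAS case.

```python
def classification_metrics(y_true, y_pred):
    """Compute threshold-based metrics from binary labels (1 = positive).

    Hypothetical helper for illustration; not taken from the study.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, len(pairs))
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Toy labels, invented for illustration only.
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 1, 0, 0, 0, 1, 0, 1],
)
print(metrics)
```

The AUC and AP are not covered by this sketch because they are computed from predicted probabilities across all thresholds rather than from a single set of hard predictions.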
Results
Four features (hemoglobin, platelet count, D-dimer, and SOFA score) were selected by recursive feature elimination, and seven prediction models were established based on these variables. The SVM model had the highest AUC (0.969), F1-score (0.737), sensitivity (0.875), and AP (0.890) among the seven models, while the KNN model had the highest specificity (1.000). The XGB and DT models overestimated the risk of IKPLAS; the calibration curves of the other models fit the observed outcomes well. Decision curve analysis showed that, at risk thresholds between 0.4 and 0.8, the net rate of intervention of the SVM model was significantly higher than that of the other models. In the feature importance ranking, the SOFA score had a substantial impact on the model.
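As background on decision curve analysis, the sketch below shows the standard net benefit calculation at a single risk threshold, together with the usual "treat all" reference strategy. It is a generic illustration rather than the study's analysis: the function names and toy data are assumptions, and the abstract's exact DCA quantity may be defined differently.

```python
def net_benefit(y_true, y_prob, threshold):
    """Standard DCA net benefit: TP/n - FP/n * pt / (1 - pt).

    Hypothetical helper for illustration; patients with predicted
    probability >= threshold are counted as "treated".
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    # False positives are weighted by the odds of the risk threshold.
    return tp / n - (fp / n) * (threshold / (1 - threshold))


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy in which every patient is treated."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Toy outcomes and predicted risks, invented for illustration only.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.6, 0.3, 0.1]
for pt in (0.2, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A decision curve plots these values across a range of thresholds; a model is considered useful over the thresholds at which its curve lies above both the "treat all" and "treat none" (net benefit of zero) strategies.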
Conclusion
An effective model for predicting invasive Klebsiella pneumoniae liver abscess syndrome in patients with diabetes mellitus can be established using machine learning algorithms and has potential application value.