Background
This study aimed to explore the value of machine learning algorithms in predicting the risk of renal damage in children with Henoch-Schönlein Purpura, to construct a predictive model of Henoch-Schönlein Purpura Nephritis in children, and to analyze the related risk factors.
Methods
Case data from 288 children hospitalized with Henoch-Schönlein Purpura between November 2018 and October 2021 were collected. The data comprised 42 indicators, including demographic characteristics, clinical symptoms and laboratory tests. Univariate feature selection was used to select features, and logistic regression, support vector machine, decision tree and random forest algorithms were each used for classification prediction. Finally, the performance of the four algorithms was compared using accuracy, recall and area under the ROC curve (AUC).
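The univariate feature selection step can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not state which univariate score was used, so the absolute Welch t-statistic here is an assumption, the helper names `welch_t` and `select_top_k` are hypothetical, and the toy values are invented for the example rather than taken from the study data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(pos, neg):
    """Absolute Welch t-statistic for one feature (assumed scoring function).

    Each group needs at least two values so a sample variance exists.
    """
    diff = abs(mean(pos) - mean(neg))
    se = sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
    if se == 0:
        # No spread in either group: perfectly separated or identical.
        return float("inf") if diff > 0 else 0.0
    return diff / se


def select_top_k(rows, labels, names, k):
    """Score every feature independently and keep the k highest-scoring ones."""
    scores = {}
    for j, name in enumerate(names):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]  # nephritis
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]  # no nephritis
        scores[name] = welch_t(pos, neg)
    ranked = sorted(names, key=lambda n: scores[n], reverse=True)
    return ranked[:k], scores


# Invented toy data: three features, six patients (not study data).
names = ["Cr", "ALB", "noise"]
rows = [
    [40.0, 42.0, 5.0],
    [42.0, 41.0, 3.0],
    [41.0, 43.0, 4.0],
    [60.0, 35.0, 4.0],
    [62.0, 36.0, 5.0],
    [61.0, 34.0, 3.0],
]
labels = [0, 0, 0, 1, 1, 1]

selected, scores = select_top_k(rows, labels, names, k=2)
print(selected)  # ['Cr', 'ALB']
```

Each feature is scored on its own, without reference to the others, which is what makes the selection univariate. The selected columns would then be passed to each of the four classifiers.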
Results
The accuracy, recall and AUC of the random forest model were 0.83, 0.86 and 0.91 respectively, compared with 0.74, 0.80 and 0.89 for the logistic regression model, 0.70, 0.80 and 0.89 for the support vector machine model, and 0.74, 0.80 and 0.81 for the decision tree model. The top 10 features ranked by importance in the random forest model were persistent purpura ≥ 4 weeks, Cr, clinic time, ALB, WBC, TC, TG, relapse, recurrent purpura and EB-DNA.
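For reference, the three metrics reported above can be computed from true labels and predicted scores as in the sketch below. The labels and scores are invented toy values, the 0.5 decision threshold is an assumption, and the function names are illustrative; none of this reproduces the study's figures.

```python
def accuracy(y_true, y_pred):
    """Fraction of all cases that were classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positive cases that the model detected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, y_score):
    """Probability that a random positive case scores above a random
    negative case; ties count as half (pairwise definition of ROC AUC)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy example: 1 = renal damage, 0 = no renal damage.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))  # 0.75
print(recall(y_true, y_pred))    # 0.75
print(auc(y_true, y_score))      # 0.9375
```

Accuracy and recall depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is why two models can share the same recall yet differ in AUC.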
Conclusion
The random forest model outperformed the logistic regression, support vector machine and decision tree models in predicting renal damage in children with Henoch-Schönlein Purpura, achieving the highest accuracy, recall and AUC, which suggests better classification and generalization performance.