Background
This study aimed to describe the current epidemiology of HIV/AIDS complicated by pulmonary tuberculosis (PTB) in the Kashgar region, to analyze the factors associated with PTB infection in HIV/AIDS patients, and to use an XGBoost model to classify and predict the risk of PTB infection in these patients, so as to improve the protection of residents.
Methods
Data on HIV/AIDS patients in the Kashgar region were collected, and patients were divided into an HIV/AIDS-only group and a dual infection group according to whether they had PTB. The prevalence of dual infection was described and its influencing factors were analyzed. All study subjects were divided into a training set and a test set at a ratio of 8:2. A linear model penalized with the L1 norm was used for feature selection. Prediction models for the risk of PTB infection in HIV/AIDS patients were then constructed with the XGBoost and logistic regression algorithms. ROC curves, the DeLong test, decision curve analysis, and calibration curves were used to evaluate model performance.
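The modelling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual code: the data are synthetic (generated with make_classification), the L1-penalized linear model is assumed to be scikit-learn's LinearSVC wrapped in SelectFromModel, and all hyperparameters (C, n_estimators, max_depth, learning_rate) are hypothetical. If the xgboost package is not installed, the sketch falls back to scikit-learn's GradientBoostingClassifier, which is a different implementation of gradient boosting. The DeLong test, decision curves, and calibration curves are omitted.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

try:
    from xgboost import XGBClassifier

    def make_booster():
        # Hypothetical hyperparameters, chosen only for illustration.
        return XGBClassifier(n_estimators=200, max_depth=3,
                             learning_rate=0.1, random_state=0)
except ImportError:
    from sklearn.ensemble import GradientBoostingClassifier

    def make_booster():
        # Stand-in used only when xgboost is unavailable.
        return GradientBoostingClassifier(random_state=0)

# Synthetic stand-in for the patient data (assumption): about one third positives.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=6,
                           weights=[0.66, 0.34], random_state=0)

# 8:2 split into training and test sets, stratified on the outcome.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Standardize using training-set statistics only, so the L1 penalty
# treats all features on a comparable scale.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Feature selection: keep features with non-zero coefficients in an
# L1-penalized linear model fitted on the training set.
selector = SelectFromModel(LinearSVC(C=0.05, penalty="l1", dual=False,
                                     random_state=0))
selector.fit(X_train_s, y_train)
X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)
n_selected = X_train_sel.shape[1]

# Fit both models on the selected features and compare AUCs.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "boosted_trees": make_booster(),
}
auc = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    auc[name] = {
        "train": roc_auc_score(y_train, model.predict_proba(X_train_sel)[:, 1]),
        "test": roc_auc_score(y_test, model.predict_proba(X_test_sel)[:, 1]),
    }
    print(f"{name}: train AUC={auc[name]['train']:.4f}, "
          f"test AUC={auc[name]['test']:.4f}")
```

Fitting the scaler and the selector on the training set only keeps the test-set AUC free of information leakage. A large gap between training and test AUC for the boosted model would point to overfitting and is worth checking before the two models are compared.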
Results
The PTB infection rate among HIV/AIDS patients was 33.6%. Residing in Kashgar, an initial CD4 lymphocyte count < 200 cells/mm3, and white blood cell counts of 4–10 × 10^9/L and > 10 × 10^9/L were associated with an increased risk of PTB in HIV/AIDS patients, whereas being a worker or farmer and being in WHO clinical stage II were protective factors. The logistic regression model achieved an AUC of 0.6962 on the training set and 0.6681 on the test set, while the XGBoost model achieved an AUC of 0.9027 on the training set and 0.8026 on the test set. The DeLong test P-values for the training set and test set were < 0.001 and 0.009, respectively, indicating superior predictive performance of the XGBoost model.
Conclusions
The dual infection rate of HIV/AIDS with PTB in Kashgar is high, and timely intervention should be provided for HIV/AIDS patients living in Kashgar who have an initial CD4 lymphocyte count < 200 cells/mm3 or a white blood cell count of 4–10 × 10^9/L or > 10 × 10^9/L. Moreover, the XGBoost model has better predictive performance than the logistic regression model, indicating that it can be used to classify and predict PTB infection in HIV/AIDS patients.