ObjectiveThe study aimed to use supervised machine learning models to predict the length and risk of prolonged hospitalization in PLWHs to help physicians timely clinical intervention and avoid waste of health resources.MethodsRegression models were established based on RF, KNN, SVM, and XGB to predict the length of hospital stay using RMSE, MAE, MAPE, and R2, while classification models were established based on RF, KNN, SVM, NN, and XGB to predict risk of prolonged hospital stay using accuracy, PPV, NPV, specificity, sensitivity, and kappa, and visualization evaluation based on AUROC, AUPRC, calibration curves and decision curves of all models were used for internally validation.ResultsIn regression models, XGB model performed best in the internal validation (RMSE = 16.81, MAE = 10.39, MAPE = 0.98, R2 = 0.47) to predict the length of hospital stay, while in classification models, NN model presented good fitting and stable features and performed best in testing sets, with excellent accuracy (0.7623), PPV (0.7853), NPV (0.7092), sensitivity (0.8754), specificity (0.5882), and kappa (0.4672), and further visualization evaluation indicated that the largest AUROC (0.9779), AUPRC (0.773) and well-performed calibration curve and decision curve in the internal validation.ConclusionThis study showed that XGB model was effective in predicting the length of hospital stay, while NN model was effective in predicting the risk of prolonged hospitalization in PLWH. Based on predictive models, an intelligent medical prediction system may be developed to effectively predict the length of stay and risk of HIV patients according to their medical records, which helped reduce the waste of healthcare resources.