Background: Pulmonary tuberculosis is a prevalent chronic disease associated with a significant economic burden on patients. Predicting costs can help rationalize the cost structure and manage expenses efficiently. Traditional models have limited predictive performance, but machine learning and big data analysis have shown promise in predicting hospitalization costs. With accurate cost predictions, medical resources could be allocated more effectively, leading to better control of patient hospitalization costs.
Methods: This study used data from the information system of a pulmonary hospital in Kashgar for the period 2020 to 2022. A total of 9570 eligible pulmonary tuberculosis patients were included. Multiple regression and Multilayer Perceptron (MLP) prediction models were developed using SPSS 26.0 and Python 3.7, respectively. The training set comprised data from 2020 and 2021, and the test set comprised data from 2022. The models predicted seven costs for pulmonary tuberculosis patients: diagnostic cost, medical service cost, material cost, treatment cost, drug cost, other cost, and total hospitalization cost. Each model's predictive performance was evaluated using R-squared (R2), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE).
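As an illustration of the evaluation described in the Methods, the following minimal sketch splits records by year and computes the three metrics in plain Python. It is not the study's code: the function names, the "year" field, and the toy values are assumptions made for this example only.

```python
import math


def split_by_year(records, test_year=2022):
    """Temporal split: years before test_year train the model, test_year tests it.

    Each record is assumed (hypothetically) to be a dict with a "year" key.
    """
    train = [r for r in records if r["year"] < test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test


def regression_metrics(y_true, y_pred):
    """Return (R-squared, RMSE, MAE) for paired observed and predicted costs."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Toy records, invented purely to exercise the functions.
records = [
    {"year": 2020, "total_cost": 100.0},
    {"year": 2021, "total_cost": 200.0},
    {"year": 2022, "total_cost": 300.0},
]
train, test = split_by_year(records)

# Toy observed and predicted costs (in yuan), also invented.
r2, rmse, mae = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])
print(len(train), len(test), r2, rmse, mae)
```

A higher R2 together with lower RMSE and MAE on the held-out year indicates better predictive performance. Splitting by year rather than at random keeps later admissions out of the training data.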
Results: Among the 9570 pulmonary tuberculosis cases included in the study, the median (first quartile, third quartile) patient age was 67.00 (55.00, 74.00) years, the median length of hospital stay was 14.00 (11.00, 21.00) days, and the median total hospitalization cost was 13150.45 (9891.34, 19648.48) yuan. Material cost accounted for the largest share of total hospitalization cost (30.25%), followed by drug cost (24.01%). Nine factors significantly influenced hospitalization costs for pulmonary tuberculosis patients: age, marital status, admission condition, length of hospital stay, initial treatment, presence of other diseases, transfer, drug resistance, and admission department. For every cost category except other cost, the MLP model achieved higher R2 and lower RMSE and MAE than multiple regression in both the training and test sets.
Conclusion: The MLP model can effectively leverage patient information to predict the various hospitalization costs accurately. Such predictions can support a more rational hospitalization cost structure by identifying higher-cost inpatient items for adjustment and balancing the different cost categories. The insights from this predictive model may also be relevant to research on other medical conditions.