Background
Rapidly progressive interstitial lung disease (RP-ILD) is a significant complication that determines the prognosis of dermatomyositis (DM). Early identification of RP-ILD can improve screening and diagnostic efficiency and guide early, aggressive treatment.
Methods
A total of 284 patients with DM were retrospectively screened, and their clinical and laboratory characteristics were recorded. Risk factors for RP-ILD in DM patients were screened by logistic regression (LR) and machine learning methods, and prediction models for RP-ILD were developed using three machine learning methods: least absolute shrinkage and selection operator (LASSO), random forest (RF), and extreme gradient boosting (XGBoost).
Results
In univariate LR, disease duration was a protective factor for RP-ILD, whereas erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), anti-Ro-52 antibody and anti-MDA5 antibody were risk factors. Intersecting the top 10 most important variables of the 3 machine learning models yielded 5 common important variables: disease duration, lactate dehydrogenase (LDH), CRP, anti-Ro-52 antibody and anti-MDA5 antibody. On the test set, the areas under the receiver operating characteristic curve (AUCs) of LASSO, RF and XGBoost were 0.661, 0.667 and 0.867, respectively. On the validation set, the AUCs of LASSO, RF and XGBoost were 0.764, 0.727 and 0.909, respectively. Based on these results, XGBoost was the optimal model in this study.
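As a rough illustration of the two computations reported above, the following minimal sketch shows how the top-ranked variables of several models can be intersected and how an AUC can be computed from predicted scores. It is not the study's code: the function names (`top_k`, `common_top_variables`, `auc`), the variable names `x1` to `x4`, and all importance values, labels and scores are hypothetical placeholders, and ranking by absolute importance is an assumption made here so that signed LASSO coefficients and unsigned tree importances can be treated alike.

```python
from typing import Dict, List, Sequence, Set


def top_k(importance: Dict[str, float], k: int = 10) -> List[str]:
    """Return the k variable names with the largest absolute importance."""
    ranked = sorted(importance, key=lambda name: abs(importance[name]), reverse=True)
    return ranked[:k]


def common_top_variables(models: Dict[str, Dict[str, float]], k: int = 10) -> Set[str]:
    """Intersect the top-k variable sets of every model."""
    top_sets = [set(top_k(scores, k)) for scores in models.values()]
    return set.intersection(*top_sets) if top_sets else set()


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical importance scores; NOT values from the study.
    toy_importance = {
        "lasso": {"x1": 0.9, "x2": -0.5, "x3": 0.1},
        "rf": {"x1": 0.4, "x2": 0.3, "x3": 0.2, "x4": 0.1},
        "xgboost": {"x2": 0.6, "x1": 0.2, "x4": 0.15},
    }
    print(sorted(common_top_variables(toy_importance, k=2)))

    # Hypothetical labels (1 = RP-ILD) and predicted scores.
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

With real models, the importance dictionaries would come from the fitted LASSO coefficients and the RF and XGBoost importance scores, and `k` would be set to 10 as in the analysis described above.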
Conclusion
Disease duration, LDH, CRP, anti-Ro-52 antibody and anti-MDA5 antibody are key predictors of RP-ILD in DM. The prediction model constructed using XGBoost can be used in practice to identify DM patients at risk of RP-ILD and to guide early intervention.