Purpose
Diagnosing osteoporosis in patients with type 2 diabetes mellitus (T2DM) based on bone mineral density (BMD) remains challenging. We sought to develop machine learning prediction models for use as screening instruments for osteoporosis in patients with T2DM.
Patients and Methods
Data were collected from 433 participants and analyzed using nine machine learning classification algorithms, with features selected from demographic and clinical variables. The classification models were compared using the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, average precision (AP), precision, F1 score, precision-recall curves, calibration plots, and decision curve analysis (DCA) to determine the best model. In addition, 5-fold cross-validation was used to optimize the model, and feature importance was then evaluated using Shapley Additive exPlanations (SHAP). Latent class analysis (LCA) was used to identify distinct subpopulations.
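As an illustration of how the comparison metrics listed above are defined, the following minimal sketch computes accuracy, sensitivity, specificity, precision, F1 score, the area under the ROC curve, and average precision from predicted probabilities. The function names, the 0.5 decision threshold, the label coding (1 = osteoporosis), and the toy labels and scores are assumptions made for illustration only; they are not the study's code or data.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Threshold-based metrics; label 1 = osteoporosis (assumed coding)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # also called recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Tied scores count as half a win, which matches the trapezoidal ROC area.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Mean of the precision values at each rank where a positive appears.

    Simplification: assumes no tied scores among the predictions.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("at least one positive label is required")
    return total / hits


# Toy example (made-up values, not study data).
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print(confusion_metrics(y_true, y_score))
print("AUROC:", roc_auc(y_true, y_score))
print("AP:", average_precision(y_true, y_score))
```

In a 5-fold cross-validation setting, these functions would be applied to the held-out fold of each split and the results averaged across folds.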
Results
Nine feature variables were identified to construct predictive models for osteoporosis in individuals with T2DM. Across the machine learning algorithms, AP ranged from 0.444 to 1.000. The XGBoost model was selected as the final prediction model, with an AUROC of 0.940 in the training set, 0.772 in the validation set under 5-fold cross-validation, and 0.872 in the test set. SHAP analysis identified 25-hydroxyvitamin D (25(OH)D) as the most important risk factor. Additionally, a three-class model constructed using LCA categorized individuals into high-, medium-, and low-risk groups.
Conclusion
We developed a model with high accuracy and clinical validity for predicting osteoporosis in patients with T2DM. We also identified three subpopulations with differing osteoporosis risk using clustering. However, the limited sample size warrants cautious interpretation of the results, and validation in larger cohorts is needed.