OBJECTIVE: Our aim was to apply state-of-the-art machine learning algorithms to predict the risk of future progression to diabetes complications, including diabetic kidney disease (≥30% decline in eGFR) and diabetic retinopathy (mild, moderate or severe).
RESEARCH DESIGN AND METHODS: Using data from a cohort of 537 adults with type 1 diabetes, we predicted diabetes complications emerging during a median follow-up of 5.4 years. Prediction models were computed first with clinical risk factors at baseline (17 measures) and then with clinical risk factors plus blood-derived metabolomics and lipidomics data (965 molecular features) at baseline. Participants were first classified into two groups: type 1 diabetes stable (n=195) or type 1 diabetes with progression to diabetes complications (n=190). Furthermore, progression of diabetic kidney disease (≥30% decline in eGFR; n=79) and diabetic retinopathy (mild, moderate or severe; n=111) were predicted in two complication-specific models. Models were compared by 5-fold cross-validated area under the receiver operating characteristic curve (AUROC). The Shapley additive explanations algorithm was used for feature selection and for interpreting the models. Accuracy, precision, recall, and F-score were used to evaluate clinical utility.
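To make the evaluation metrics named above concrete, the following is a minimal sketch in plain Python, not the authors' pipeline. It computes AUROC as the fraction of positive/negative pairs ranked correctly, averages it over k folds, and derives accuracy, precision, recall, and F-score from a confusion matrix. The function names, the round-robin stratified fold assignment, and the `fit_predict` callback are all hypothetical choices made for illustration; the abstract does not state how folds were built or which classifier was used.

```python
from statistics import mean


def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs where the positive
    scores higher; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(labels, predictions):
    """Accuracy, precision, recall, and F-score (F1) for binary 0/1 labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": (tp + tn) / len(pairs),
            "precision": precision, "recall": recall, "f1": f1}


def stratified_folds(labels, k=5):
    """Assumed fold scheme: deal sample indices round-robin within each
    class so every fold keeps roughly the overall class balance."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


def cross_validated_auroc(labels, fit_predict, k=5):
    """Mean AUROC over k held-out folds.

    fit_predict(train_idx, test_idx) is a hypothetical user-supplied callback
    that trains on train_idx and returns one score per index in test_idx.
    """
    folds = stratified_folds(labels, k)
    per_fold = []
    for f, test_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        scores = fit_predict(train_idx, test_idx)
        per_fold.append(auroc([labels[i] for i in test_idx], scores))
    return mean(per_fold)
```

In practice `fit_predict` would wrap whatever classifier is being evaluated; each class needs at least k samples so that every held-out fold contains both outcomes.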
RESULTS: During a median follow-up of 5.4 years, 79 (21%) of the participants (mean ± SD age 54.8 ± 13.7 years) showed progression of diabetic kidney disease and 111 (29%) progressed to diabetic retinopathy. The predictive models for diabetic kidney disease progression were highly accurate with clinical risk factors alone (accuracy 0.95; AUROC 0.92, 95% CI 0.857;0.995) and improved further when omics-based predictors were included (accuracy 0.98; AUROC 0.99, 95% CI 0.876;0.997). The predictive panel comprised albuminuria, retinopathy, estimated glomerular filtration rate, hemoglobin A1c, and six metabolites (five identified as ribitol, ribonic acid, myo-inositol, and 2,4- and 3,4-dihydroxybutanoic acids).
Models for diabetic retinopathy progression were less predictive, with an AUROC of 0.81 (95% CI 0.754;0.958) using clinical risk factors and an AUROC of 0.87 (95% CI 0.781;0.996) with omics included. The final retinopathy panel included hemoglobin A1c, albuminuria, mild degree of retinopathy, and seven metabolites, including one ceramide and 3,4-dihydroxybutanoic acid.
CONCLUSIONS: We demonstrate that machine learning can effectively predict five-year progression of diabetes complications, in particular diabetic kidney disease, using a panel of known clinical risk factors in combination with blood small molecules. Further replication of this machine learning tool in a real-world setting or a clinical trial would facilitate its implementation in the clinic.