Introduction
The objective of this study was to build a machine learning model that predicts incident mild cognitive impairment, Alzheimer's disease, and related dementias from structured data drawn from administrative and electronic health record sources.
Methods
A cohort of patients (n = 121,907) and controls (n = 5,307,045) was created for modeling using data from within 2 years of each patient's incident diagnosis date. Additional cohorts 3–8 years removed from index data were used for prediction. Training cohorts were matched on age, gender, index year, and utilization, and models were fit with LightGBM, a gradient boosting machine.
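The matching step can be illustrated with a minimal sketch. It assumes exact matching within strata and a fixed number of controls per case; the abstract does not state the matching algorithm or ratio, so both are assumptions. The function name `match_controls` and the field names (`age_band`, `gender`, `index_year`, `utilization_band`) are hypothetical.

```python
import random
from collections import defaultdict


def match_controls(cases, controls, keys, ratio=1, seed=0):
    """Pair each case with up to `ratio` controls that share the same values
    for every field in `keys`. Each control is used at most once.

    Assumption: exact matching without replacement. The study's actual
    matching procedure is not described in the abstract.
    """
    rng = random.Random(seed)

    # Group controls into strata keyed by the matching variables.
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control)

    # Shuffle within each stratum so the selection is random but reproducible.
    for stratum in pool.values():
        rng.shuffle(stratum)

    matched = []
    for case in cases:
        stratum = pool[tuple(case[k] for k in keys)]
        # Take as many controls as are available, up to the requested ratio.
        for _ in range(min(ratio, len(stratum))):
            matched.append((case, stratum.pop()))
    return matched
```

Cases with no eligible control in their stratum are simply left unmatched here; a real pipeline would need a rule for such cases, for example coarsening the strata.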
Results
On a held‐out test set, the incident 2‐year model had a sensitivity of 47% and an area under the curve (AUC) of 87%. In the 3‐year model, the learned labels achieved a sensitivity of 24% (AUC 71%), which dropped to 15% (AUC 72%) in year 8.
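For readers less familiar with the two metrics reported above, the following sketch shows how sensitivity and AUC are computed from predicted scores. The labels, scores, and the 0.5 threshold are toy values chosen for illustration, not study data, and the abstract does not state the operating threshold used.

```python
def sensitivity(labels, scores, threshold):
    """Fraction of true cases (label 1) whose score meets the threshold."""
    true_positives = sum(
        1 for y, s in zip(labels, scores) if y == 1 and s >= threshold
    )
    return true_positives / sum(labels)


def auc(labels, scores):
    """Probability that a randomly chosen case scores higher than a
    randomly chosen control, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 cases and 4 controls (illustrative values only).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

print(sensitivity(toy_labels, toy_scores, threshold=0.5))
print(auc(toy_labels, toy_scores))
```

Sensitivity depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two can move in different directions between the 3‐year and 8‐year models.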
Discussion
The model's ability to discriminate incident cases of dementia suggests that it could be a worthwhile tool for screening patients for trial recruitment and for patient management.