Objective:
To determine whether information in medical and pharmacy claims data can predict, at the time of prescribing the first antiepileptic drug (AED), which patients with epilepsy will become resistant to AEDs.
Method:
We analyzed longitudinal claims data from 1,376,756 patients with epilepsy from 2006 to 2015. Of these, 582,258 satisfied all inclusion criteria; 49,916 ultimately met our operational definition of AED resistance, namely claims filed for at least 4 distinct AEDs. We constructed 1,270 candidate predictors (“features”) reflecting demographics, comorbidities, medications, procedures, epilepsy status, and payer status to characterize the cohort. On the training dataset (528,640 patients), we used ANOVA F-value tests to select predictive features and trained several prediction algorithms, including logistic regression, support vector machines (SVM), and random forests. A model using only age and gender served as the benchmark.
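The labeling rule and the feature-screening step described above can be illustrated with a minimal sketch. It is not the study's code: the function names, the toy feature values, and the example drug names are hypothetical, and the F statistic is the textbook one-way ANOVA formula computed one feature at a time.

```python
from statistics import fmean


def is_aed_resistant(aed_claims, threshold=4):
    """Apply the operational definition: claims for >= `threshold` distinct AEDs."""
    return len(set(aed_claims)) >= threshold


def anova_f_value(values, labels):
    """One-way ANOVA F statistic of a single feature across outcome groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 groups and more samples than groups")
    grand_mean = fmean(values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of individual values around their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_features(features, labels, k):
    """Rank features (name -> list of values) by F value and keep the top k names."""
    ranked = sorted(features, key=lambda name: anova_f_value(features[name], labels),
                    reverse=True)
    return ranked[:k]


# Toy data (hypothetical): 1 = resistant, 0 = not resistant.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "feature_a": [1, 2, 3, 7, 8, 9],  # separates the two groups well
    "feature_b": [5, 1, 4, 2, 5, 3],  # identical group means, no signal
}
top = select_top_features(features, labels, k=1)
```

In this toy example `feature_a` ranks first because its group means differ sharply relative to the spread within each group. The selected features would then be passed to a classifier such as logistic regression.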
Results:
On a held-out test set (53,618 patients), the best model achieved an area under the receiver operating characteristic (ROC) curve (AUC) [95% CI] of 0.753 [0.747, 0.759], compared to 0.664 [0.658, 0.671] for the benchmark model. Moreover, predicted probabilities of drug resistance closely matched the observed frequencies. Compared to waiting for 2 AED failures, our model predicted drug resistance on average 2.25 years earlier.
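The two evaluation quantities reported above, discrimination (AUC with a confidence interval) and calibration (predicted versus observed frequency), can be sketched as follows. The abstract does not state how the interval was computed; the percentile bootstrap used here is an assumption, and all function names and toy values are hypothetical.

```python
import random


def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc([scores[i] for i in idx], ys))
    aucs.sort()
    lo = aucs[max(0, int((alpha / 2) * n_boot))]
    hi = aucs[max(0, int((1 - alpha / 2) * n_boot) - 1)]
    return lo, hi


def calibration_bins(probs, labels, n_bins=5):
    """Per equal-width bin: (mean predicted probability, observed rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b), len(b))
        for b in bins
        if b
    ]


# Toy predictions (hypothetical): 1 = resistant, 0 = not resistant.
probs = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = roc_auc(probs, labels)
ci_low, ci_high = bootstrap_auc_ci(probs, labels, n_boot=200)
table = calibration_bins(probs, labels, n_bins=2)
```

A well-calibrated model shows mean predicted probabilities close to the observed rates in every bin. The pairwise AUC computation is quadratic in sample size, so a rank-based implementation would be preferable on a test set of tens of thousands of patients.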
Conclusion:
Predictive models built from large claims datasets using machine learning methods can accurately predict, at the time of prescribing the first AED, which patients with epilepsy will prove drug resistant. The ability to predict refractoriness may help patients consider alternative therapies earlier in the course of their epilepsy.