Background
Currently, postpartum depression (PPD) screening relies mainly on self‐reported, symptom‐based assessment and lacks an objective, integrative tool that identifies women at increased risk before the onset of PPD. We developed and validated a machine learning‐based PPD prediction model using electronic health record (EHR) data and identified novel PPD predictors.
Methods
A nationwide longitudinal cohort of 214,359 births between January 2008 and December 2015 was constructed from the EHR database of Israel's largest health maintenance organization and divided into model training and validation sets. PPD was defined as a new diagnosis of a depressive episode or an antidepressant prescription within the first year postpartum. A gradient‐boosted decision tree algorithm was applied to EHR‐derived sociodemographic, clinical, and obstetric features.
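The case definition above can be illustrated with a minimal sketch. It assumes that "first year postpartum" means 365 days after delivery, that the delivery date itself is excluded, and that the qualifying events (depressive-episode diagnoses and antidepressant prescriptions) have already been extracted as dates, with any prior-history exclusions handled upstream. The function name `is_ppd_case` and its inputs are hypothetical and not taken from the study.

```python
from datetime import date, timedelta

# Assumption: "first year postpartum" is taken as 365 days after delivery.
POSTPARTUM_WINDOW = timedelta(days=365)


def is_ppd_case(birth_date, event_dates):
    """Return True if any qualifying event falls in the postpartum window.

    event_dates: dates of depressive-episode diagnoses or antidepressant
    prescriptions (hypothetical input; extraction is not shown here).
    """
    window_end = birth_date + POSTPARTUM_WINDOW
    # An event counts if it occurs after delivery and no later than window_end.
    return any(birth_date < d <= window_end for d in event_dates)
```

For example, `is_ppd_case(date(2010, 1, 1), [date(2010, 6, 1)])` flags the birth as a case, whereas an event dated 18 months after delivery would not.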
Results
Among the birth cohort, 1.9% (n = 4104) met the case definition of new‐onset PPD. In the validation set, the prediction model achieved an area under the curve (AUC) of 0.712 (95% confidence interval, 0.690–0.733), with a sensitivity of 0.349 and a specificity of 0.905 at the 90th percentile risk threshold, identifying PPD cases at more than three times the rate in the overall set (positive and negative predictive values were 0.074 and 0.985, respectively). The model's strongest predictors included both well‐recognized (e.g., past depression) and less‐recognized (differing patterns of blood tests) PPD risk factors.
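To make the threshold-based metrics concrete, the following sketch shows how sensitivity, specificity, and predictive values could be computed when births scoring above a risk percentile are flagged as high risk. It is an illustration on toy data, not the study's evaluation code. The function name `metrics_at_percentile`, the nearest-rank percentile rule, and the strict "above cutoff" comparison are all assumptions.

```python
import math


def _ratio(numerator, denominator):
    # Return NaN rather than raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def metrics_at_percentile(scores, labels, percentile=90):
    """Flag scores strictly above the percentile cutoff and summarize.

    scores: predicted risks; labels: 1 for a PPD case, 0 otherwise.
    Assumption: nearest-rank percentile; tied scores at the cutoff
    are not flagged, so fewer than (100 - percentile)% may be flagged.
    """
    ranked = sorted(scores)
    k = max(0, math.ceil(percentile * len(ranked) / 100) - 1)
    cutoff = ranked[k]

    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        flagged = score > cutoff
        if flagged and label:
            tp += 1
        elif flagged:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1

    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy data: 20 births with risk scores 1..20; cases at scores 5, 10, and 20.
toy_scores = list(range(1, 21))
toy_labels = [1 if s in (5, 10, 20) else 0 for s in toy_scores]
toy_metrics = metrics_at_percentile(toy_scores, toy_labels, percentile=90)
```

In the toy data the top two scores are flagged and one of them is a case, so the positive predictive value is 0.5 against a base rate of 0.15. The same kind of comparison underlies the reported enrichment of PPD cases in the high-risk group.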
Conclusions
Machine learning‐based models incorporating EHR‐derived predictors could augment symptom‐based screening practice by identifying the high‐risk population in greatest need of preventive intervention before PPD develops.