Purpose: Predicting 30-day readmission risk is paramount to improving the quality of patient care. Previous studies have examined clinical risk factors associated with hospital readmissions. In this study, we compare sets of patient, provider, and community-level variables that are available at two different points of a patient's inpatient encounter (first 48 hours and the full encounter) to train readmission prediction models in order to identify and target appropriate actionable interventions that can potentially reduce avoidable readmissions.
Methods: Using EHR data from a retrospective cohort of 2460 oncology patients, two sets of binary classification models predicting 30-day readmission were developed; one trained on variables that are available within the first 48 hours of admission and another trained on data from the entire hospital encounter. A comprehensive machine learning analysis pipeline was leveraged including preprocessing and feature transformation, feature importance and selection, machine learning modeling, and post-analysis.
Results: Leveraging all features, the LGB (light gradient boosted machines) model produced higher, but comparable performance: (AUC: 0.711 and APS: 0.225) compared to Epic (AUC: 0.697 and APS: 0.221). Given features in the first 48-hours, the random forest model produces higher AUC (0.684), but lower PRC (0.18) and APS (0.184) than the Epic model (AUC: 0.676). In terms of the characteristics of patients flagged by these models, both the full LGB and 48-hour (random forest) feature models were highly sensitive in flagging more patients than the Epic models. Both models flagged patients with a similar distribution of race and sex; however, our LGB and random forest models more inclusive flagging more patients among younger age groups. The Epic models were more sensitive to identifying patients with an average lower zip income. Our 48-hour models were powered by novel features at various levels: patient (weight changeover 365 days, depression symptoms, laboratory values, cancer type), provider (winter discharge, hospital admission type), community (zip income, marital status of partner).
Conclusion: We demonstrated that we could develop and validate models comparable to existing Epic 30-day readmission models, but provide several actionable insights that could create service interventions deployed by the case management or discharge planning teams that may decrease readmission rates over time.