Purpose
De‐implementation of low‐value services among patients with limited life expectancy is challenging. Robust mortality prediction models using routinely collected health care data can enhance health care stakeholders' ability to identify populations with limited life expectancy. We developed and validated a claims‐based prediction model for 5‐year mortality using regularized regression methods.
Methods
Medicare beneficiaries aged 66 years or older with an office visit and at least 12 months of pre‐visit continuous enrollment in Medicare Parts A and B were identified in 2008. Five‐year mortality was assessed through 2013. Secondary outcomes included 30‐day, 90‐day, 180‐day, and 1‐year mortality. Claims‐based predictors, including comorbidities and indicators of disability, frailty, and functional impairment, were selected using regularized logistic regression with the least absolute shrinkage and selection operator (LASSO) in a random 80% training sample. Model performance was assessed and compared with the Gagne comorbidity score in the remaining 20% validation sample.
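The selection step described above can be illustrated with a minimal sketch, which is not the authors' implementation. It fits an L1‐penalized (LASSO) logistic regression by proximal gradient descent on synthetic data, holds out a random 20% for validation, and reports the c‐statistic. The data, the penalty value `lam`, the step size, and all function names are assumptions chosen for illustration; the abstract does not say how the penalty was tuned.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, lr=0.5, n_iter=200):
    # Proximal gradient descent on mean log-loss + lam * sum(|w_j|).
    # The intercept b is not penalized.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= lr * grad_b
        w = [soft_threshold(w[j] - lr * grad_w[j], lr * lam) for j in range(p)]
    return w, b


def predict(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]


def c_statistic(y_true, y_prob):
    # Share of (death, survivor) pairs in which the death has the higher
    # predicted risk; tied predictions count as half.
    events = [p for p, t in zip(y_prob, y_true) if t == 1]
    nonevents = [p for p, t in zip(y_prob, y_true) if t == 0]
    concordant = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


# Synthetic stand-in for claims-based predictors (assumption: only the
# first two of six features truly affect the outcome).
random.seed(0)
N_ROWS, N_FEATURES = 500, 6
X = [[random.gauss(0.0, 1.0) for _ in range(N_FEATURES)] for _ in range(N_ROWS)]
y = [1 if random.random() < sigmoid(-1.0 + 2.0 * r[0] - 1.5 * r[1]) else 0 for r in X]

# Random 80% training / 20% validation split.
idx = list(range(N_ROWS))
random.shuffle(idx)
train, valid = idx[:400], idx[400:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]
X_valid, y_valid = [X[i] for i in valid], [y[i] for i in valid]

w, b = fit_lasso_logistic(X_train, y_train, lam=0.05)
auc = c_statistic(y_valid, predict(X_valid, w, b))
print("coefficients:", [round(v, 3) for v in w])
print("validation c-statistic:", round(auc, 3))
```

Predictors whose coefficients are shrunk to exactly zero drop out of the model, which is how LASSO performs selection. In practice the penalty would normally be chosen by cross‐validation within the training sample, and a dedicated library would replace this hand‐written loop.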
Results
Overall, 183 204 individuals (24%) died. In addition to demographics, 161 indicators of comorbidity and function were included in the final model. In the validation sample, the c‐statistic was 0.825 (0.823‐0.828). The median predicted probability of 5‐year mortality was 14%; almost 4% of the cohort had a predicted probability greater than 80%. Compared with the Gagne score, the LASSO model improved 5‐year mortality classification (net reclassification index = 9.9%; integrated discrimination index = 5.2%).
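The two reclassification measures reported above can be computed from paired predicted probabilities, as in the following hedged sketch. The abstract does not say whether the net reclassification index was category‐based or category‐free; the continuous (category‐free) form is assumed here. The function names and the four‐person toy data are illustrative only.

```python
def mean(values):
    return sum(values) / len(values)


def integrated_discrimination(y_true, p_old, p_new):
    # Change in mean predicted risk among events minus the change among
    # non-events, comparing the new model with the old one.
    ev = [i for i, t in enumerate(y_true) if t == 1]
    ne = [i for i, t in enumerate(y_true) if t == 0]
    gain_events = mean([p_new[i] for i in ev]) - mean([p_old[i] for i in ev])
    gain_nonevents = mean([p_new[i] for i in ne]) - mean([p_old[i] for i in ne])
    return gain_events - gain_nonevents


def continuous_nri(y_true, p_old, p_new):
    # Category-free NRI: net share of events whose risk moved up, plus the
    # net share of non-events whose risk moved down.
    ev = [i for i, t in enumerate(y_true) if t == 1]
    ne = [i for i, t in enumerate(y_true) if t == 0]
    up_ev = sum(1 for i in ev if p_new[i] > p_old[i])
    down_ev = sum(1 for i in ev if p_new[i] < p_old[i])
    up_ne = sum(1 for i in ne if p_new[i] > p_old[i])
    down_ne = sum(1 for i in ne if p_new[i] < p_old[i])
    return (up_ev - down_ev) / len(ev) + (down_ne - up_ne) / len(ne)


# Toy data: two deaths (1) and two survivors (0), with predicted risks
# from a reference model (p_old) and a candidate model (p_new).
y_toy = [1, 1, 0, 0]
p_old = [0.5, 0.6, 0.4, 0.3]
p_new = [0.7, 0.5, 0.2, 0.3]

idi_value = integrated_discrimination(y_toy, p_old, p_new)
nri_value = continuous_nri(y_toy, p_old, p_new)
print("IDI:", round(idi_value, 3))
print("continuous NRI:", round(nri_value, 3))
```

Positive values on either measure indicate that the candidate model assigns higher risk to those who died and lower risk to those who survived, relative to the reference model.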
Conclusions
Our claims‐based model for predicting 5‐year mortality showed excellent discrimination and calibration, similar to the Gagne score model, while improving mortality classification. Regularized regression is a feasible approach for developing prediction tools that could enhance health care research and evaluation of care quality.