Background
Missing data are common in alcohol clinical trials for both continuous and binary endpoints. Approaches to handling missing data have been explored for continuous outcomes, yet no studies have compared missing data approaches for binary outcomes (e.g., abstinence, no heavy drinking days). The present study compares approaches to modeling binary outcomes with missing data in the COMBINE Study.
Method
We included participants in the COMBINE Study who had complete drinking data during treatment and who were assigned to active medication or placebo conditions (N=1146). Using simulation methods, missing data were introduced under common scenarios with varying sample sizes and amounts of missing data. Logistic regression was used to estimate the effect of naltrexone (vs. placebo) on any drinking and any heavy drinking at the end of treatment using four analytic approaches: complete case analysis (CCA), last observation carried forward (LOCF), a worst-case scenario in which missing data are treated as any drinking or heavy drinking (WCS), and multiple imputation (MI). In separate analyses, these approaches were compared after drinking data were manually deleted for participants who discontinued treatment but continued to provide drinking data.
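As an illustration only, the four approaches can be sketched on simulated data. This is not the study's code. It makes several assumptions: all function names, drinking probabilities, and missingness rates are hypothetical; the log odds ratio from a 2x2 table (with 0.5 added to each cell) stands in for a logistic regression with treatment as the only predictor; and the MI step uses a deliberately simplified imputation model that conditions on treatment arm alone, pooled with Rubin's rules.

```python
import math
import random


def log_odds_ratio(pairs):
    """Log odds ratio of drinking (y=1) for naltrexone (arm=1) vs. placebo (arm=0).

    With one binary predictor this matches the logistic regression slope.
    Assumption: 0.5 is added to every cell so the estimate stays finite.
    """
    cells = {(a, y): 0.5 for a in (0, 1) for y in (0, 1)}
    for arm, y in pairs:
        cells[(arm, y)] += 1
    est = math.log(cells[(1, 1)] * cells[(0, 0)] / (cells[(1, 0)] * cells[(0, 1)]))
    var = sum(1.0 / c for c in cells.values())
    return est, var


# Each row is (arm, last_observed_status, end_of_treatment_outcome or None).
def cca(rows):
    """Complete case analysis: drop participants with a missing outcome."""
    return log_odds_ratio([(a, y) for a, _, y in rows if y is not None])


def locf(rows):
    """Last observation carried forward: reuse the last observed status."""
    return log_odds_ratio([(a, last if y is None else y) for a, last, y in rows])


def wcs(rows):
    """Worst-case scenario: a missing outcome is counted as drinking (1)."""
    return log_odds_ratio([(a, 1 if y is None else y) for a, _, y in rows])


def mi(rows, m=20, seed=0):
    """Simplified multiple imputation, pooled with Rubin's rules.

    Assumption: the imputation model conditions only on treatment arm.
    """
    rng = random.Random(seed)
    obs = {a: [y for arm, _, y in rows if arm == a and y is not None] for a in (0, 1)}
    ests, variances = [], []
    for _ in range(m):
        # Draw each arm's drinking probability from a Beta posterior (uniform prior).
        p = {a: rng.betavariate(1 + sum(ys), 1 + len(ys) - sum(ys))
             for a, ys in obs.items()}
        completed = [(a, int(rng.random() < p[a]) if y is None else y)
                     for a, _, y in rows]
        est, var = log_odds_ratio(completed)
        ests.append(est)
        variances.append(var)
    pooled = sum(ests) / m
    within = sum(variances) / m
    between = sum((e - pooled) ** 2 for e in ests) / (m - 1)
    return pooled, within + (1 + 1 / m) * between


def simulate(n=1000, seed=1):
    """Hypothetical trial: drinkers are more likely to have a missing outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        arm = rng.randint(0, 1)
        p_drink = 0.55 if arm else 0.65  # hypothetical drinking probabilities
        last = int(rng.random() < p_drink)
        outcome = last if rng.random() < 0.7 else int(rng.random() < p_drink)
        p_miss = 0.45 if last else 0.15  # hypothetical missingness rates
        rows.append((arm, last, None if rng.random() < p_miss else outcome))
    return rows


if __name__ == "__main__":
    data = simulate()
    for name, method in [("CCA", cca), ("LOCF", locf), ("WCS", wcs), ("MI", mi)]:
        est, var = method(data)
        print(f"{name}: log OR = {est:.3f}, SE = {math.sqrt(var):.3f}")
```

A realistic MI analysis would also condition on baseline covariates and earlier drinking data. The arm-only model here is kept minimal so that the imputation and pooling steps remain visible.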
Results
WCS produced the greatest amount of bias in treatment effect estimates. MI usually yielded less biased estimates than WCS and CCA in the simulated data, and performed considerably better than LOCF when estimating treatment effects among individuals who discontinued treatment.
Conclusions
Missing data can introduce bias in treatment effect estimates in alcohol clinical trials. Researchers should use modern missing data methods, including MI, and avoid WCS and CCA when analyzing binary alcohol clinical trial outcomes.