IMPORTANCE Opioid overdose is a leading public health problem in the United States; however, national data on overdose deaths are delayed by several months or more.
OBJECTIVES To build and validate a statistical model for estimating national opioid overdose deaths in near real time.
DESIGN, SETTING, AND PARTICIPANTS In this cross-sectional study, signals from 5 overdose-related proxy data sources encompassing health, law enforcement, and online data from 2014 to 2019 in the US were combined using a LASSO (least absolute shrinkage and selection operator) regression model, and weekly predictions of opioid overdose deaths were made for 2018 and 2019 to validate model performance. Results were also compared with those from a baseline SARIMA (seasonal autoregressive integrated moving average) model, one of the most used approaches to forecasting injury mortality.
EXPOSURES Time series data from 2014 to 2019 on emergency department visits for opioid overdose from the National Syndromic Surveillance Program, data on the volume of heroin and synthetic opioids circulating in illicit markets via the National Forensic Laboratory Information System, data on the search volume for heroin and synthetic opioids on Google, and data on post volume on heroin and synthetic opioids on Twitter and Reddit were used to train and validate prediction models of opioid overdose deaths.
MAIN OUTCOMES AND MEASURES Model-based predictions of weekly opioid overdose deaths in the United States were made for 2018 and 2019 and compared with actual observed opioid overdose deaths from the National Vital Statistics System.
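The LASSO step named in the design can be illustrated with a minimal sketch. It is not the study's implementation: the penalty value, the two toy signals (`ed_visits`, `search_volume`), and the function names are hypothetical, and the abstract does not describe how the real signals were scaled, lagged, or tuned. The sketch fits an L1-penalized linear regression by coordinate descent and shows the property that makes LASSO useful for combining proxy sources: a signal that adds no information receives a coefficient of exactly zero.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero; values within the penalty become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_fit(X, y, penalty, n_sweeps=100):
    """Minimize (1/(2n)) * sum((y - b0 - X.b)^2) + penalty * sum(|b|)
    by cyclic coordinate descent. The intercept b0 is not penalized."""
    n, p = len(X), len(X[0])
    coefs = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_sweeps):
        # Intercept update: mean residual given the current coefficients.
        intercept = sum(
            y[i] - sum(X[i][k] * coefs[k] for k in range(p)) for i in range(n)
        ) / n
        for j in range(p):
            rho = 0.0    # covariance of feature j with the partial residual
            scale = 0.0  # sum of squares of feature j
            for i in range(n):
                fit_without_j = intercept + sum(
                    X[i][k] * coefs[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - fit_without_j)
                scale += X[i][j] ** 2
            if scale > 0:
                coefs[j] = soft_threshold(rho / n, penalty) / (scale / n)
            else:
                coefs[j] = 0.0
    return intercept, coefs


# Hypothetical toy data (not real surveillance data): 6 weeks, 2 centered signals.
ed_visits = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]       # informative signal
search_volume = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]    # unrelated to the outcome
X = [[a, b] for a, b in zip(ed_visits, search_volume)]
weekly_deaths = [100.0 + 3.0 * a for a in ed_visits]

intercept, coefs = lasso_fit(X, weekly_deaths, penalty=0.5)
print(intercept, coefs)
```

In this toy setup the informative signal keeps a coefficient slightly below its true value of 3 (the shrinkage), and the unrelated signal is dropped entirely.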
RESULTS Statistical models using the 5 real-time proxy data sources estimated the national opioid overdose death rate for 2018 and 2019 with errors of 1.01% and −1.05%, respectively. For weekly predictions, the machine learning-based approach had a root mean squared error of 60.3 overdose deaths for 2018 (compared with 310.2 overdose deaths for the SARIMA model) and 67.2 overdose deaths for 2019 (compared with 83.3 overdose deaths for the SARIMA model).
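The two accuracy measures reported above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the weekly counts are made-up numbers, the function names are hypothetical, and the sign convention for the annual error (predicted minus observed, relative to observed) is an assumption, since the abstract does not define it.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between paired weekly predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def annual_percent_error(predicted, observed):
    """Signed error of the yearly total, as a percentage of the observed total.
    Assumed convention: positive means the model overestimated."""
    total_observed = sum(observed)
    return 100.0 * (sum(predicted) - total_observed) / total_observed


# Hypothetical weekly counts for illustration only (not study data).
observed_deaths = [900, 950, 1000, 1050]
predicted_deaths = [910, 940, 1015, 1045]

weekly_rmse = rmse(predicted_deaths, observed_deaths)
yearly_error = annual_percent_error(predicted_deaths, observed_deaths)
print(weekly_rmse, yearly_error)
```

The example also shows why the two measures can diverge: weekly over- and underestimates partly cancel in the yearly total, so a small annual percentage error does not by itself imply a small weekly RMSE.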
CONCLUSIONS AND RELEVANCE Results of this serial cross-sectional study suggest that proxy administrative data sources can be used to estimate national opioid overdose mortality trends to provide a more timely understanding of this public health problem.