Background: Digital just-in-time adaptive interventions can reduce binge-drinking events (BDEs; consuming ≥4 drinks for women and ≥5 drinks for men per occasion) in young adults but need to be optimized for timing and content. Delivering just-in-time support messages in the hours prior to BDEs could improve intervention impact.
Objective: We aimed to determine the feasibility of developing a machine learning (ML) model that uses smartphone sensor data to accurately predict future (ie, same-day) BDEs 1 to 6 hours before they occur. We also aimed to identify the most informative phone sensor features associated with BDEs on weekends and weekdays, that is, the key features that explain prediction model performance.
Methods: We collected phone sensor data from 75 young adults (aged 21 to 25 years; mean 22.4, SD 1.9 years) with risky drinking behavior who reported their drinking behavior over 14 weeks. The participants in this secondary analysis were enrolled in a clinical trial. We developed ML models testing different algorithms (eg, extreme gradient boosting [XGBoost] and decision tree) to predict same-day BDEs (vs low-risk drinking events and non-drinking periods) using smartphone sensor data (eg, accelerometer and GPS). We tested various “prediction distance” time windows (more proximal: 1 hour; distant: 6 hours) from drinking onset. We also tested various analysis time windows (ie, the amount of data to be analyzed), ranging from 1 to 12 hours prior to drinking onset, because this determines the amount of data that needs to be stored on the phone to compute the model. Explainable artificial intelligence was used to explore interactions among the most informative phone sensor features contributing to the prediction of BDEs.
Results: The XGBoost model performed the best in predicting imminent same-day BDEs, with 95% accuracy on weekends and 94.3% accuracy on weekdays (F1-score=0.95 and 0.94, respectively). This XGBoost model needed 12 hours of phone sensor data at a 3-hour prediction distance from the onset of drinking on weekends, and 9 hours of data at a 6-hour prediction distance on weekdays. The most informative phone sensor features for BDE prediction were time (eg, time of day) and GPS-derived features, such as the radius of gyration (an indicator of travel). Interactions among key features (eg, time of day and GPS-derived features) contributed to the prediction of same-day BDEs.
Conclusions: We demonstrated the feasibility and potential use of smartphone sensor data and ML for accurately predicting imminent (same-day) BDEs in young adults. The prediction model provides “windows of opportunity,” and with the adoption of explainable artificial intelligence, we identified “key contributing features” to trigger just-in-time adaptive interventions prior to the onset of BDEs, which has the potential to reduce the likelihood of BDEs in young adults.
Trial Registration: ClinicalTrials.gov NCT02918565; https://clinicaltrials.gov/ct2/show/NCT02918565
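The radius of gyration named above as an informative GPS-derived feature can be illustrated with a minimal sketch. It uses the common definition (root mean square distance of the location samples from their centroid); the abstract does not state the exact formula the authors used, so the definition, the arithmetic-mean centroid (reasonable only over small areas), and the function names `haversine_m` and `radius_of_gyration_m` are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def radius_of_gyration_m(points):
    """Root mean square distance (meters) of GPS samples from their centroid.

    `points` is a non-empty list of (latitude, longitude) pairs in degrees.
    The centroid is the plain mean of the coordinates, which is an
    approximation that assumes the samples cover a small area.
    """
    if not points:
        raise ValueError("at least one GPS sample is required")
    lat_c = sum(lat for lat, _ in points) / len(points)
    lon_c = sum(lon for _, lon in points) / len(points)
    squared = [haversine_m(lat, lon, lat_c, lon_c) ** 2 for lat, lon in points]
    return math.sqrt(sum(squared) / len(squared))
```

Under this definition, a phone that stays in one place yields a value of zero, and the value grows as the samples in the analysis window spread out, which is why it can serve as an indicator of travel.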
BACKGROUND: Digital behavioral interventions can reduce binge drinking events (BDEs: consuming 4+/5+ drinks per occasion for women/men) in young adults; however, they may not be optimized for timing or content. Delivering support in the hours prior to a predicted drinking event could improve the impact of that support.
OBJECTIVE: In this paper, our goal is to explore the feasibility of predicting future, that is, same-day, BDEs using smartphone sensor data passively collected prior to the onset of drinking occasions. We also aim to identify the sensor features and associated behavior patterns related to drinking event planning that contribute most to predicting BDEs on weekends and weekdays, respectively.
METHODS: We collected usable phone sensor data from 75 young adult drinkers (ages 21-25; mean=22.4, SD=1.9) who self-reported drinking behavior for up to 14 weeks. We developed a machine learning model to predict BDEs (versus non-drinking events and low-risk drinking events [1-3 or 1-4 drinks per occasion for females and males, respectively]) using smartphone sensor data (e.g., accelerometer, location). We tested various "prediction distance" time windows (more proximal: 1 hour; more distant: 6 hours) from the onset of drinking. We also tested various analysis time windows (i.e., amount of data to be analyzed), ranging from 1 to 12 hours prior to the onset of drinking, because this determines the amount of data that needs to be stored on the phone to compute the model.
RESULTS: The best performing machine learning model, using 3 and 6 hours of phone sensor data at a 1-hour distance prior to the event, predicted same-day BDEs with an accuracy of 91.4% on weekends and 91.3% on weekdays. Some of the most important phone sensor-based behavioral markers contributing to model accuracy were latitude and longitude of GPS locations, power of movement (i.e., the absolute value of the change in velocity of body movement), radius of gyration (an indicator of travel), battery charge level, and smartphone interaction.
CONCLUSIONS: We demonstrated the potential use of smartphone sensor data and machine learning to accurately predict future binge drinking events, identifying “windows of opportunity” to trigger just-in-time interventions prior to the onset of BDEs, with the promise of reducing the likelihood of BDEs in young adults.
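The relationship between the "prediction distance" and the analysis time window described in the methods can be sketched as follows. This is one plausible reading, not the authors' stated procedure: it assumes the analysis window ends at the prediction point (the drinking onset minus the prediction distance) and extends backward from there. The function names `analysis_window` and `select_samples` and the `(timestamp, value)` sample layout are hypothetical.

```python
from datetime import datetime, timedelta


def analysis_window(onset, prediction_distance_h, analysis_window_h):
    """Return (start, end) of the sensor data window for one prediction.

    Assumption: the window ends `prediction_distance_h` hours before the
    drinking onset and covers the `analysis_window_h` hours before that.
    """
    end = onset - timedelta(hours=prediction_distance_h)
    start = end - timedelta(hours=analysis_window_h)
    return start, end


def select_samples(samples, onset, prediction_distance_h, analysis_window_h):
    """Keep the (timestamp, value) samples that fall inside the window.

    The interval is half-open: start is included, end is excluded.
    """
    start, end = analysis_window(onset, prediction_distance_h, analysis_window_h)
    return [(t, v) for t, v in samples if start <= t < end]


if __name__ == "__main__":
    # Illustrative values only: a 21:00 onset, 1-hour distance, 3-hour window.
    onset = datetime(2020, 1, 4, 21, 0)
    print(analysis_window(onset, prediction_distance_h=1, analysis_window_h=3))
```

A longer analysis window means more raw sensor data must be retained on the phone before a prediction can be made, which is the storage trade-off the methods mention.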