Background: Youth experiencing homelessness (YEH) suffer from substance use problems disproportionately compared to other youth. A study found that 69% of YEH meet the criteria for dependence on at least one substance, compared to 1.8% of all US adolescents. In addition, they experience major structural and social inequalities, which further undermine their ability to get the care they need.

Objective: The goal of this study is to develop a machine learning-based framework that uses homeless youth's social media content (posts and interactions) to predict their substance use behaviors (i.e., the probability of using certain substances). With this framework, social workers and care providers can identify and reach out to YEH who are at a higher risk of substance use.

Methods: We recruited 133 homeless youth at a non-profit organization located in a city in the western United States. After obtaining their consent, we collected two types of data: (1) participants' social media conversations from the year before they were recruited; and (2) participants' responses to a survey on their demographic information, health conditions, sexual behaviors, and substance use behaviors. Building on the social sharing of emotions theory and social support theory, we identified important features that can potentially predict substance use. We then used natural language processing techniques to extract these features from social media conversations and reactions, and built a series of machine learning models to predict participants' marijuana use.

Results: We evaluated our models on their predictive performance as well as their conformity to measures of fairness. Without predictive features from survey information, which may introduce gender and racial biases, our machine learning models reach an AUC of 0.74 and an accuracy of 0.77 using social media data only. We also evaluated the false positive rate for each gender and age segment.

Conclusions: We showed that textual interactions among YEH and their friends on social media can serve as a powerful resource for predicting their substance use. The framework we developed allows care providers to allocate resources efficiently to the YEH in greatest need, with minimal overhead. It can be extended to analyze and predict other health-related behaviors and conditions observed in this vulnerable community.
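The evaluation described in the Results (AUC, accuracy, and a false positive rate per demographic segment) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the group labels "a" and "b", and all data values below are made-up assumptions chosen only to show how such metrics are typically computed.

```python
from collections import defaultdict


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive example is scored
    higher than a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def false_positive_rate_by_group(y_true, y_pred, groups):
    """False positive rate (FP / all true negatives) within each group.
    Groups with no true negatives are omitted from the result."""
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 0:
            negatives[g] += 1
            if p == 1:
                false_pos[g] += 1
    return {g: false_pos[g] / n for g, n in negatives.items()}


# Toy data, invented for illustration only (not from the study).
y_true = [1, 0, 1, 0, 0, 1, 0, 1]                   # 1 = reports use
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8, 0.1, 0.3]   # model probabilities
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]   # hypothetical segments

# Assumed decision threshold of 0.5 for turning scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

auc = roc_auc(y_true, scores)
acc = accuracy(y_true, y_pred)
fpr = false_positive_rate_by_group(y_true, y_pred, groups)

print(f"AUC={auc:.4f} accuracy={acc:.2f} FPR by group={fpr}")
```

Comparing the per-group false positive rates is one common way to check whether a model flags members of one segment incorrectly more often than another; which fairness measures the study actually used is not stated in the abstract.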