Background
In recent years, both suicide and overdose rates have been increasing. Many individuals who struggle with opioid use disorder are prone to suicidal ideation, which may result in overdose. However, fatal overdoses are difficult to classify as intentional or unintentional. Intentional overdose is difficult to detect, partly because of a lack of predictors and partly because social stigma pushes individuals away from seeking help. These individuals may instead use web-based means to articulate their concerns.
Objective
This study aimed to extract posts of suicidality among opioid users on Reddit using machine learning methods. The performance of the models depends on the purity of the data, and the results will help us better understand the rationale of these users, providing new insights into individuals who are part of the opioid epidemic.
Methods
Reddit posts between June 2017 and June 2018 were collected from r/suicidewatch, r/depression, a set of opioid-related subreddits, and a control subreddit set. We first classified suicidal versus nonsuicidal language and then classified users with opioid usage versus those without opioid usage. Several traditional baselines and neural network (NN) text classifiers were trained using subreddit names as the labels and combinations of semantic inputs. We then attempted to extract out-of-sample data belonging to the intersection of suicide ideation and opioid abuse. Amazon Mechanical Turk was used to provide labels for the out-of-sample data.
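The labeling scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's pipeline: the subreddit groupings, the `label_by_subreddit` and `intersect_predictions` helpers, and the keyword-based stand-in classifiers are all assumptions made for the example. In particular, the source does not list the opioid-related subreddits or say which class r/depression was assigned to, so those choices are placeholders.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical groupings. Only r/suicidewatch is named in the source as a
# suicide-related subreddit; "opiates" is a placeholder for the unnamed
# opioid-related set.
SUICIDAL_SUBS: Set[str] = {"suicidewatch"}
OPIOID_SUBS: Set[str] = {"opiates"}


def label_by_subreddit(
    posts: List[Dict[str, str]], positive_subs: Set[str]
) -> List[Tuple[str, int]]:
    """Use the subreddit name as a weak label: 1 if in positive_subs, else 0."""
    return [
        (post["text"], int(post["subreddit"].lower() in positive_subs))
        for post in posts
    ]


def intersect_predictions(
    texts: List[str],
    is_suicidal: Callable[[str], bool],
    is_opioid: Callable[[str], bool],
) -> List[str]:
    """Keep out-of-sample texts that both classifiers flag as positive."""
    return [t for t in texts if is_suicidal(t) and is_opioid(t)]


# Toy posts, invented for illustration only.
posts = [
    {"subreddit": "SuicideWatch", "text": "post a"},
    {"subreddit": "opiates", "text": "post b"},
    {"subreddit": "AskReddit", "text": "post c"},
]
suicide_training = label_by_subreddit(posts, SUICIDAL_SUBS)
opioid_training = label_by_subreddit(posts, OPIOID_SUBS)

# Stand-ins for two trained classifiers (simple keyword checks here).
flagged = intersect_predictions(
    ["hopeless and using", "hopeless", "using"],
    is_suicidal=lambda t: "hopeless" in t,
    is_opioid=lambda t: "using" in t,
)
```

In practice, each stand-in function would be replaced by a trained text classifier, and the flagged posts would be the candidates sent for human labeling.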
Results
Classification results were at least 90% across all models for at least one combination of input; the best classifier was a convolutional neural network, which obtained an F1 score of 96.6%. When predicting out-of-sample data for posts containing both suicidal ideation and signs of opioid addiction, NN classifiers produced more false positives, whereas traditional methods produced more false negatives; false negatives are the less desirable error when predicting suicidal sentiments.
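To make the distinction between the two error types concrete, the following sketch computes precision, recall, and F1 from raw counts. The counts are invented for illustration and are not taken from the study; `precision_recall_f1` is a hypothetical helper name.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts only: one model errs toward false positives,
# the other toward false negatives.
fp_heavy = precision_recall_f1(tp=80, fp=20, fn=5)
fn_heavy = precision_recall_f1(tp=80, fp=5, fn=20)
```

With these made-up counts, both models have the same F1, but the false-negative-heavy model has lower recall, meaning it misses more truly positive posts. This is why recall deserves separate attention when missed cases are the costlier error.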
Conclusions
Opioid abuse is linked to the risks of unintentional overdose and suicide. Social media platforms such as Reddit contain metadata that can aid machine learning and provide information at a personal level that cannot be obtained elsewhere. We demonstrate that it is possible to use NNs as a tool to predict an out-of-sample target with a model built from data sets labeled by characteristics we wish to distinguish in the out-of-sample target.