Social media's unfettered access has made it an important venue for health discussion and a resource for patients and their loved ones. However, the quality of the information available, as well as the motivations of its posters, has been questioned. This work examines the individuals on social media who post questionable health-related information, in particular those promoting cancer treatments that have been shown to be ineffective (making this a kind of misinformation, willful or not). Using a multi-stage user selection process, we study 4,212 Twitter users who have posted about one of 139 such "treatments", and compare them to a baseline of users generally interested in cancer. Considering features capturing user attributes, writing style, and sentiment, we build a classifier that identifies users prone to propagating such misinformation with an accuracy of over 90%, providing a potential tool for public health officials to identify such individuals for preventive intervention.
A. Ghenai & Y. Mejova
imposed on often profit-seeking websites, social media provides a dynamic forum for propagating possible medical misinformation [31]. The recent rise in vaccine hesitancy has been linked to an active movement on Twitter promoting conspiratorial thinking and mistrust in the government [65]. Image-sharing platforms such as Flickr and Instagram have become battlegrounds between the pro-anorexia movement and physicians attempting to intervene [13,92]. Uncertainty surrounding infectious disease outbreaks, such as the Zika epidemic of 2016, yielded rumors and speculation about their causes, preventive measures, and consequences [23,34].
In this study we turn to the individuals sharing questionable medical information on Twitter, in particular cancer treatments that have been medically proven to be ineffective. With around 336 million monthly active users in the first quarter of 2018, Twitter is one of the largest social media websites expressly dedicated to the sharing of information, including information on cancer. Compiling hundreds of thousands of tweets on 139 queries, including acupuncture, cinnamon, reflexology, and vitamin C, we apply strict selection criteria employing human/organization classification [61], name dictionaries, usage thresholds, and crowdsourced relevance refinement, resulting in 4,212 users, whom we then compare to users mentioning cancer in general from a previous study [70]. Drawing on previous research on rumor detection, we characterize these users in multi-faceted feature spaces encompassing user attributes, linguistic style, sentiment, and post timing. We find that these users use more sophisticated language and are interested in cancer, but are not personally involved with the illness.
We build a logistic regression model that, among Twitter users mentioning cancer, identifies with a high level of accuracy those who will eventually post a piece of misinformation.
Misinformation on social media is an urgent issue, and even more so in the health field. This paper is one of th...