Background: Social media allows researchers to study opinions and reactions to events in real time. One area needing more study is anthrax-related events. A computational framework that uses machine learning techniques was created to collect tweets discussing anthrax, categorize them by relevance for each month of data collection, and detect discussions of anthrax-related events.

Objective: The objective of this study was to detect discussions of anthrax-related events and to determine the relevance of the tweets and the topics of discussion over 12 months of data collection.

Methods: This is an infoveillance study using English-language tweets containing the keywords “Anthrax” and “Bacillus anthracis”, collected from September 25, 2017, through August 15, 2018. Machine learning techniques were used to determine what people were tweeting about anthrax. Tweet counts were plotted over time to determine whether an event was detected (defined as a 3-fold spike in tweets). A machine learning classifier was created to categorize tweets by relevance to anthrax. Relevant tweets were examined month by month using a topic modeling approach to determine the topics of discussion over time and how anthrax-related events influenced that discussion.

Results: Over the 12 months of data collection, a total of 204,008 tweets were collected. Logistic regression performed best for relevance classification (precision=0.81; recall=0.81; F1-score=0.80). In total, 26 topics were identified; they related to anthrax-related events, highly retweeted tweets, natural outbreaks, and news stories.

Conclusions: This study shows that tweets related to anthrax can be collected and analyzed over time to determine what people are discussing and to detect key anthrax-related events. Future studies could focus only on opinion tweets, apply the methodology to other terrorism events, or use it to monitor for terrorism threats.
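The event-detection rule in the Methods (a 3-fold spike in tweets) can be illustrated with a short sketch. The abstract does not say what baseline the spike is measured against, so the version below assumes a comparison with the median of the preceding 7 days. The function name `detect_spikes`, the `window` and `fold` parameters, and the sample counts are all hypothetical and are not taken from the study.

```python
from statistics import median


def detect_spikes(daily_counts, window=7, fold=3.0):
    """Return indices of days whose count is at least `fold` times
    the median of the preceding `window` days.

    Assumption: the trailing-median baseline is an illustrative choice;
    the study's abstract only states that an event is a 3-fold spike.
    """
    spikes = []
    for i in range(window, len(daily_counts)):
        baseline = median(daily_counts[i - window:i])
        # Skip days with a zero baseline to avoid flagging noise.
        if baseline > 0 and daily_counts[i] >= fold * baseline:
            spikes.append(i)
    return spikes


# Made-up daily tweet counts for illustration only (not data from the study).
counts = [100, 110, 95, 105, 98, 102, 107, 400, 120, 101]
print(detect_spikes(counts))  # prints [7]
```

A trailing median is used here rather than a mean so that one earlier spike does not inflate the baseline for the days that follow; other baselines (for example, a monthly average) would be equally consistent with the abstract's wording.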