Background
Discussions of health issues on social media are a crucial source of information that reflects real-world responses to events and opinions. They matter for public health care because they act as pathways of influence on vaccination decision-making by hesitant individuals. Artificial intelligence methodologies based on internet search engine queries have been proposed for detecting disease outbreaks and monitoring population behavior. Among social media platforms, Twitter is commonly used to search for and share opinions and (mis)information about health care issues, including vaccines and vaccination.
Objective
Our primary objective was to support the design and implementation of future eHealth strategies and interventions on social media to increase the quality of targeted communication campaigns and thereby increase influenza vaccination rates. To this end, we aimed to define an artificial intelligence–based approach to elucidate how threads on Twitter about influenza vaccination changed during the COVID-19 pandemic. Such findings may support adapted vaccination campaigns and could be generalized to other health-related mass communications.
Methods
The study comprised the following 5 stages: (1) collecting tweets related to influenza, vaccines, and vaccination in the United States; (2) cleansing and storing the data using machine learning techniques; (3) identifying terms, hashtags, and topics related to influenza, vaccines, and vaccination; (4) building a dynamic folksonomy of the previously defined vocabulary (terms and topics) to support the understanding of its trends; and (5) labeling and evaluating the folksonomy.
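Stages 2 and 3 can be illustrated with a minimal sketch. It assumes simple rule-based cleaning with regular expressions, which is a stand-in for, not a reproduction of, the machine learning techniques used in the study. The function names (`clean_tweet`, `extract_hashtags`, `term_counts`) are hypothetical.

```python
import re
from collections import Counter

# Assumed patterns for tweet noise; the study's actual cleansing rules are not specified.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def clean_tweet(text):
    """Remove URLs and user mentions, lowercase, and collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return " ".join(text.lower().split())


def extract_hashtags(text):
    """Return the hashtags in a tweet, lowercased and without the '#' sign."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def term_counts(tweets):
    """Count how often each token occurs across a collection of tweets."""
    counts = Counter()
    for tweet in tweets:
        cleaned = clean_tweet(tweet).replace("#", " ")
        counts.update(TOKEN_RE.findall(cleaned))
    return counts
```

The most frequent entries of `term_counts(...)` could then serve as a candidate vocabulary for the folksonomy built in stage 4.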
Results
We collected and analyzed 2,782,720 tweets from 420,617 unique users posted between December 30, 2019, and April 30, 2021. The tweets were in English, originated in the United States, and included at least one of the following terms: “flu,” “influenza,” “vaccination,” “vaccine,” or “vaxx.” We noticed that the prevalence of the terms “vaccine” and “vaccination” increased over 2020, and that occurrences of “flu” and “covid” were inversely correlated, as “flu” gradually disappeared from the tweets. By combining word embedding and clustering, we then identified a folksonomy built around the following 3 topics dominating the content of the collected tweets: “health and medicine (biological and clinical aspects),” “protection and responsibility,” and “politics.” By analyzing terms that frequently appeared together, we noticed that the tweets were related mainly to COVID-19 pandemic events.
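The keyword filter and the inverse correlation between terms can be sketched as follows. This is a minimal illustration rather than the study's pipeline: whole-word matching, monthly aggregation, and the Pearson coefficient are assumptions, and the helper names (`matches_filter`, `monthly_share`, `pearson`) are hypothetical.

```python
import math
import re

# Inclusion terms as listed in the Results.
FILTER_TERMS = ("flu", "influenza", "vaccination", "vaccine", "vaxx")
WORD_RE = re.compile(r"[a-z]+")


def matches_filter(text):
    """True if the tweet contains at least one inclusion term as a whole word (assumed)."""
    words = set(WORD_RE.findall(text.lower()))
    return any(term in words for term in FILTER_TERMS)


def monthly_share(tweets, term):
    """Given (month, text) pairs, return the share of tweets per month containing `term`."""
    totals, hits = {}, {}
    for month, text in tweets:
        totals[month] = totals.get(month, 0) + 1
        if term in WORD_RE.findall(text.lower()):
            hits[month] = hits.get(month, 0) + 1
    return {month: hits.get(month, 0) / totals[month] for month in sorted(totals)}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

Applying `pearson` to the monthly shares of “flu” and “covid” would yield a negative coefficient when one term rises as the other declines, which is the pattern described above.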
Conclusions
This study focused initially on vaccination against influenza and then moved to vaccination against COVID-19. Machine learning–supported infoveillance of Twitter and other social media, covering topics related to vaccines and vaccination against communicable diseases and their trends, can lead to the design of personalized messages that encourage targeted subpopulations to engage in vaccination. A greater likelihood of a targeted population receiving a personalized message is associated with greater response, engagement, and proactiveness of that population in the vaccination process.