2022
DOI: 10.2196/41198

Unmasking the Twitter Discourses on Masks During the COVID-19 Pandemic: User Cluster–Based BERT Topic Modeling Approach

Abstract: Background: The COVID-19 pandemic has spotlighted the politicization of public health issues. A public health monitoring tool must be equipped to reveal a public health measure’s political context and guide better interventions. In its current form, infoveillance tends to neglect identity and interest-based users, hence being limited in exposing how public health discourse varies by different political groups. Adopting an algorithmic tool to classify users and their short social media texts might re…

Citations: cited by 8 publications (4 citation statements)
References: 64 publications (86 reference statements)

Citation statements:
“…Relying on patents as a proxy for innovation might not capture the full spectrum of grassroots and open-source solutions. The NLP method BERT also has its own limitations; it leaves out a nonnegligible portion of the corpus due to incongruent themes, which we reported in this study as outliers [37]. The clusters generated are not perfect, but they suffice for a quick, relatively easy, and comprehensible analysis of large amounts of data.…”
Section: Discussion
Citation type: mentioning
confidence: 81%
“…Future work could consider using BERTopic which uses the pretrained Bidirectional Encoder Representations from Transformers (BERT) model as a feature extractor and apply clustering algorithms to identify latent topics in a text repository. BERT-topic also allows for the incorporation of human inputs such as topic labels or domain knowledge [48]. Our study did not focus on any differences in nutrition-related tweets based on geographies.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…The tweets included in the data set were not analyzed for possible bot activity, and bots can also spread misinformation [51]. However, bot presence was likely low, as retweets were excluded for this study, and bots usually retweet content without tweeting the original content [52]. An argument put forth against deleting bot tweets in a data set is that it is “artificially manipulating a raw data set,” as bots are naturally found on Twitter [53].…”
Section: Discussion
Citation type: mentioning
confidence: 99%