2022
DOI: 10.2196/37831

An Analysis of French-Language Tweets About COVID-19 Vaccines: Supervised Learning Approach

Abstract: Background: As the COVID-19 pandemic progressed, disinformation, fake news, and conspiracy theories spread through many parts of society. However, the disinformation spreading through social media is, according to the literature, one of the causes of increased COVID-19 vaccine hesitancy. In this context, the analysis of social media posts is particularly important, but the large amount of data exchanged on social media platforms requires specific methods. This is why machine learning and natural lan…

Cited by 10 publications (13 citation statements)
References 25 publications
“…In previous work, several custom algorithms were proposed, such as (1) deep learning with a CamemBERT model [ 28 ], BERT [ 46 , 47 ], RoBERTa [ 48 ], FastText [ 49 ], convolutional neural network–long short-term memory with word2vec embeddings [ 50 ] or (2) machine learning with naïve Bayes [ 51 ] or decision tree [ 52 ] models. Off-the-shelf sentiment analysis models include Amazon Web Services Comprehend sentiment analysis [ 53 ] and VADER [ 54 - 60 ], which is a Python lexicon and rule-based sentiment analysis tool [ 43 ].…”
Section: Discussion (mentioning)
confidence: 99%
“…Mønsted and Lehmann [ 49 ] obtained a micro-averaged F1-score of 0.762 for a 3-class prediction, which is a good result considering that the authors used Amazon’s Mechanical Turk platform. Sauvayre et al [ 28 ] classified tweets according to the opinion for or against users and obtained an accuracy of 0.706 using a fine-tuned CamemBERT prediction model with 2 classes, whereas we used 3 classes, which is more unfavorable. Kummervold et al [ 46 ] obtained an F1-score of 0.78 to predict the attitude of pregnant women toward vaccination against COVID-19, but the categories for the classification were different.…”
Section: Discussion (mentioning)
confidence: 99%
“…The obtained information, after manually analyzed, if presented through a web platform, could further aid in raising awareness regarding valid information, fake news content, as well as irrelevant information related to vaccines and shared through Twitter platform ( 20 , 34 ). With regards to the machine learning validation results obtained in other studies, a relevant comparison with the ones from the current studies would be difficult, since the majority of the studies which used vaccine related Twitter content reported the F1 Score as the most important classification evaluation metric ( 12-17 , 37 , 47 , 48 ). The F1 Score was computed in the current study as well and can be regarded as an acceptable balanced measure between precision and recall.…”
Section: Discussion (mentioning)
confidence: 94%
“…That is the main reason for which the focus in the current study, when evaluating the classification performance of the 6 algorithms, was put on the Matthews Correlation Coefficient ( 36 , 37 ). Moreover, the reported F1 Scores showed a high degree of variability, some studies reported F1 Scores of under 0.6, while others reported enhanced values, of 0.7–0.8 and others obtained almost perfect values, of over 0.95, while the implemented machine learning algorithms included both classical (such as Random Forest and SVM) and newer model types (such as deep learning and BERT) applied on various languages, such as English, French, Dutch or Moroccan ( 12-17 , 47 , 48 ). As a comparison, the maximum F1 Score which was obtained in the current study SVM ranged from 0.542 (internal period validation – SVM + MLP Ensemble) to 0.658 (cross-validation – BERT) and 0.655 (external validation – SVM).…”
Section: Discussion (mentioning)
confidence: 99%
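
The citation statements above compare classifiers by F1 score and by the Matthews Correlation Coefficient (MCC). As a reading aid, here is a minimal sketch of how both are computed from binary confusion-matrix counts. It is not code from any of the cited studies: the function name `binary_metrics` and the example counts are hypothetical, chosen only to show how the two metrics can diverge on imbalanced data.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and MCC from binary confusion counts.

    Any ratio with a zero denominator is reported as 0.0 (a common convention,
    assumed here for illustration).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall; it ignores true negatives.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells of the confusion matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical, heavily imbalanced counts: almost every item is positive and
# the classifier never correctly identifies a negative.
metrics = binary_metrics(tp=90, fp=9, fn=1, tn=0)
print(metrics)
```

With these made-up counts, F1 stays close to 0.95 while MCC is slightly negative, because F1 does not account for true negatives. This illustrates why a study might report MCC alongside F1, as one of the quoted statements describes.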