Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda 2021
DOI: 10.18653/v1/2021.nlp4if-1.18

Fighting the COVID-19 Infodemic with a Holistic BERT Ensemble

Abstract: This paper describes the TOKOFOU system, an ensemble model for misinformation detection tasks based on six different transformer-based pre-trained encoders, implemented in the context of the COVID-19 Infodemic Shared Task for English. We fine-tune each model on each of the task's questions and aggregate their prediction scores using a majority voting approach. TOKOFOU obtains an overall F1 score of 89.7%, ranking first.
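
The abstract states only that per-model prediction scores are aggregated with majority voting. The sketch below is a minimal, hypothetical illustration of such an aggregation over six per-question probability scores; the function name `majority_vote`, the 0.5 decision threshold, and the tie-breaking rule are assumptions for illustration, not details taken from the TOKOFOU implementation.

```python
import numpy as np

def majority_vote(model_probs, threshold=0.5):
    """Aggregate per-model probabilities into one 0/1 label per example.

    model_probs: array-like of shape (n_models, n_examples), scores in [0, 1].
    """
    probs = np.asarray(model_probs, dtype=float)
    votes = (probs >= threshold).astype(int)   # each model's binary vote
    yes_counts = votes.sum(axis=0)             # number of "yes" votes per example
    # Strict majority of the models; a 3-3 tie falls to the negative class
    # (an illustrative tie-breaking choice, not necessarily the paper's).
    return (yes_counts > probs.shape[0] / 2).astype(int)

# Toy usage: six models scoring three tweets for one of the task's questions.
scores = [
    [0.9, 0.2, 0.6],
    [0.8, 0.4, 0.3],
    [0.7, 0.1, 0.8],
    [0.6, 0.3, 0.2],
    [0.2, 0.6, 0.7],
    [0.9, 0.4, 0.4],
]
print(majority_vote(scores))  # -> [1 0 0]
```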

Cited by 6 publications (3 citation statements)
References 15 publications
“…As can be seen in the results, for the NLP4IF dataset, multilingual models perform better than the monolingual models and the best systems in Bulgarian and English while performing on par in Arabic. Please note that these best systems (Qarqaz et al, 2021;Tziafas et al, 2021) have been trained specifically on those language pairs using language specific natural language processing pipelines, yet the multilingual models outperform them in English and Bulgarian.…”
Section: Results
confidence: 99%
“…With the recent popularity gained by embedding and deep learning-based approaches in natural language processing, there was a tendency to use deep neural networks powered by content embeddings to perform false information classification too (Ma et al, 2016). Later, with the introduction of transformers (Devlin et al, 2019;Conneau et al, 2020), there was a tendency to involve large pretrained transformer models also (Uyangodage et al, 2021;Tziafas et al, 2021;Qarqaz et al, 2021). However, all of these models were trained specifically on a single language making them less useful in real scenarios where we need to process multilingual data.…”
Section: Related Work
confidence: 99%