2021 IEEE International Conference on Big Data (Big Data)
DOI: 10.1109/bigdata52589.2021.9671970

Transforming Fake News: Robust Generalisable News Classification Using Transformers

Abstract: As online news has become increasingly popular and fake news increasingly prevalent, the ability to audit the veracity of online news content has become more important than ever. Such a task represents a binary classification challenge, for which transformers have achieved state-of-the-art results. Using the publicly available ISOT and Combined Corpus datasets, this study explores transformers' abilities to identify fake news, with particular attention given to investigating generalisation to unseen datasets w…

Cited by 11 publications (6 citation statements)
References 34 publications
“…This finding motivates the need for more robust models. It also adds to the body of evidence that suggests that current publicly available datasets are simply too small to train generalisable models [4]. This may explain why models trained on the ISOT dataset performed marginally better in terms of generalisability compared to models trained on the smaller datasets, and it further demonstrates the need for larger, well-labelled datasets, such as those hosted by Facebook, to be made more readily available.…”
Section: Discussion (mentioning)
confidence: 98%
“…This finding is replicated by a study by [5], which performed a comparable test on similar, if not identical, datasets of different topics and found similar drops in accuracy between models trained on political news and tested on celebrity news, and vice versa. Likewise, a study by [4] explores how well models generalise across topics by testing on two datasets: the ISOT [1], [2] dataset and the Combined Corpus (CC) [25] dataset. Most of the data in the ISOT dataset is political in nature, whereas the Combined Corpus covers additional topics such as healthcare, sports and entertainment.…”
Section: Related Work (mentioning)
confidence: 99%
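The cross-topic evaluation described in the citation above can be sketched as a simple protocol: fit a classifier on one topic's corpus, then compare its accuracy on that topic against its accuracy on a different topic. The tiny keyword-weight classifier and the toy corpora below are illustrative stand-ins, not the transformer models or the ISOT / Combined Corpus data used in the study.

```python
# Hedged sketch of a cross-dataset generalisation test: train on political
# news, then measure the accuracy drop on celebrity news (toy data throughout).

def train_keyword_weights(corpus):
    """Accumulate a signed count per token: +1 per fake article, -1 per real."""
    weights = {}
    for text, label in corpus:  # label: 1 = fake, 0 = real
        for token in text.lower().split():
            weights[token] = weights.get(token, 0) + (1 if label else -1)
    return weights

def predict(weights, text):
    score = sum(weights.get(tok, 0) for tok in text.lower().split())
    return 1 if score > 0 else 0

def accuracy(weights, corpus):
    return sum(predict(weights, t) == y for t, y in corpus) / len(corpus)

# Hypothetical in-topic (political) and cross-topic (celebrity) corpora.
political = [("shocking secret government hoax", 1),
             ("senate passes budget bill", 0),
             ("miracle cure hidden by officials", 1),
             ("minister announces trade agreement", 0)]
celebrity = [("shocking celebrity scandal exposed as hoax", 1),
             ("actor reveals secret wedding ceremony", 0)]

w = train_keyword_weights(political)
in_topic = accuracy(w, political)    # optimistic: same topic as training data
cross_topic = accuracy(w, celebrity) # drops when topical vocabulary shifts
print(f"in-topic: {in_topic:.2f}, cross-topic: {cross_topic:.2f}")
# in-topic: 1.00, cross-topic: 0.50
```

The gap between the two numbers is the quantity both cited studies report: a model that keys on topic-specific vocabulary looks strong in-topic but degrades on an unseen topic.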
“…Research on multiclass classification models with Transformers shows that BERT and DistilBERT outperform deep-learning alternatives at fake news classification [36]. In comparative studies on a fake review classifier, RoBERTa outperformed the ALBERT and DistilBERT models [37].…”
Section: Related Work (unclassified)
“…This comparison is based on a set of performance metrics: accuracy, precision, recall, and F1 score.

Model            Accuracy   Precision   Recall     F1 score
Linear SVM [34]  —          —           —          92.00%
FakeBERT [35]    98.90%     —           —          —
deBERTa [36]     97.70%     97.70%      97.70%     98.90%…”
Section: Comparative Performance Analysismentioning
confidence: 99%
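The four metrics in the comparison above follow directly from a binary confusion matrix. A minimal sketch, using made-up confusion counts rather than any result from [34]-[36]:

```python
# Hedged sketch: accuracy, precision, recall, and F1 computed from raw
# true/false positive/negative counts (illustrative numbers only).

def classification_metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # fraction of all correct calls
    precision = tp / (tp + fp)                   # of predicted fake, truly fake
    recall = tp / (tp + fn)                      # of truly fake, caught as fake
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = classification_metrics(tp=90, fp=10, fn=10, tn=90)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
# accuracy=0.90 precision=0.90 recall=0.90 f1=0.90
```

Reporting all four matters here because, as the table shows, some cited systems publish only one of them, which makes direct comparison across studies partial at best.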