Proceedings of the Web Conference 2021
DOI: 10.1145/3442381.3450111
The Surprising Performance of Simple Baselines for Misinformation Detection

Abstract: As social media becomes increasingly prominent in our day-to-day lives, it is increasingly important to detect informative content and prevent the spread of disinformation and unverified rumours. While many sophisticated and successful models have been proposed in the literature, they are often compared with older NLP baselines such as SVMs, CNNs, and LSTMs. In this paper, we examine the performance of a broad set of modern transformer-based language models and show that with basic fine-tuning, these models ar…
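To make the abstract's framing concrete, the sketch below implements one of the "older NLP baselines" that sophisticated models are typically compared against: a classic bag-of-words Naive Bayes text classifier built from scratch. This is an illustrative stand-in (the paper names SVMs, CNNs, and LSTMs as the usual baselines); all texts, labels, and function names here are invented for demonstration, not drawn from the paper or its datasets.

```python
# Illustrative sketch: a multinomial Naive Bayes classifier over a
# bag-of-words representation, the kind of simple pre-transformer
# baseline the paper's comparison concerns. Toy data only.
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train_nb(texts, labels, alpha=1.0):
    """Fit multinomial Naive Bayes with Laplace (add-alpha) smoothing."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in zip(texts, labels):
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab, alpha

def predict_nb(model, text):
    class_counts, word_counts, vocab, alpha = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for w in tokenize(text):
            if w not in vocab:
                continue  # skip out-of-vocabulary words
            p = (word_counts[label][w] + alpha) / (total_words + alpha * len(vocab))
            score += math.log(p)  # accumulate log likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented placeholder data, labelled "reliable" vs. "misinfo".
texts = [
    "officials confirm policy in peer reviewed report",
    "peer reviewed study confirms vaccine safety",
    "shocking secret miracle cure they hide",
    "secret insider claims rigged election shocking",
]
labels = ["reliable", "reliable", "misinfo", "misinfo"]
model = train_nb(texts, labels)
print(predict_nb(model, "peer reviewed report confirms study"))   # → reliable
print(predict_nb(model, "shocking secret miracle claims"))        # → misinfo
```

The paper's point, as the abstract states, is that transformer-based language models with only basic fine-tuning can outperform this class of baseline, so comparisons against such models understate how strong a "simple" modern baseline already is.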

Cited by 39 publications (9 citation statements)
References 94 publications (147 reference statements)
“…Content-based methods mainly utilize the textual or visual content from the news article and related posts for news verification (Yang et al. 2012; Afroz, Brennan, and Greenstadt 2012; Kwon et al. 2013; Przybyla 2020; Ma et al. 2016; Zellers et al. 2019; Qi et al. 2019; Gupta et al. 2013; Jin et al. 2016b; Kaliyar, Goswami, and Narang 2021). These methods enable the detection of fake news at an early stage (Wei et al. 2021; Pelrine, Danovitch, and Rabbany 2021). However, their performance is limited as they ignore auxiliary knowledge for news verification.…”
Section: Related Work
confidence: 99%
“…Aside from 'myth busters' around COVID-19 and election ballot processing, online users could seek little guidance from the institutions in discerning misinformation. Discerning misinformation is, according to current evidence [23,36,63], not something users are endowed with, and experts for now only provide strategies for inoculation against misinformation [32,38] in addition to the automated detection [31,50,79].…”
Section: Is Misinformation a Wicked Problem?
confidence: 99%
“…The distribution of participants per their self-reported gender identity was: 102 (43.4%) female, 117 (49.78%) male, and 16 (6.82%) preferred not to say. Age-wise, 32 (12.76%) were in the 18–30 bracket, 100 (42.55%) in the 31–40 bracket, 60 (25.53%) in the 41–50 bracket, 28 (11.91%) in the 51–60 bracket, and 15 (6.38%) were 61 or older. The distribution of the political leanings within the sample was: apolitical 10 (4.25%), left-leaning 115 (48.93%), moderate 61 (25.95%), and right-leaning 49 (20.85%).…”
Section: Sample
confidence: 99%
“…For instance, [4,9] respectively employ Recurrent Neural Networks and Convolutional Neural Networks to model the variations of text and user representations over time. Hierarchical attention networks [10] and pre-trained language models [11] have also proven effective. Another line of work leverages propagation-based information diffusion patterns to encode information flow along user interaction edges.…”
Section: Related Work
confidence: 99%
“…Investigation of Spurious Correlations Despite the promising performance of deep learning models, reliance on dataset-related cues has been observed in a wide range of tasks including text classification [17], natural language inference [32] and visual question answering [30]. In fact-checking scenarios, language models can capture underlying identities of news sites [19], and rumor instances can possess time-sensitive characteristics [11]. Spurious artifacts lead to model failure on out-of-domain test instances, as empirically observed by [23,29,41].…”
Section: Related Work
confidence: 99%