Concept Drift Adaptive Physical Event Detection for Social Media Streams

Suprem, Abhijit; Musaev, Aibek; Pu, Calton

doi:10.1007/978-3-030-23381-5_7

Cited by 8 publications

(2 citation statements)

References 20 publications

“…see Kouw and Loog (2018)). Works such as Suprem et al (2019) and Žliobaitė (2010) have used continuous/incremental learning to train models to respond to "concept drift". In the context of misinformation detection, what is defined as "in-domain" and "out-of-domain" can vary.…”

Section: Evaluation and Temporal Generalizability (mentioning)

confidence: 99%

Temporal Generalizability in Multimodal Misinformation Detection

Stepanova, Ross

2023

Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP

Misinformation detection models degrade in performance over time, but the precise causes of this remain under-researched, in particular for multimodal models. We present experiments investigating the impact of temporal shift on performance of multimodal automatic misinformation detection classifiers. Working with the r/Fakeddit dataset, we found that evaluating models on temporally out-of-domain data (i.e. data from time stretches unseen in training) results in a non-linear, 7-8% drop in macro F1 as compared to traditional evaluation strategies (which do not control for the effect of content change over time). Focusing on two factors that make temporal generalizability in misinformation detection difficult, content shift and class distribution shift, we found that content shift has a stronger effect on recall. Within the context of coarse-grained vs. fine-grained misinformation detection with r/Fakeddit, we find that certain misinformation classes seem to be more stable with respect to content shift (e.g. Manipulated and Misleading Content). Our results indicate that future research efforts need to explicitly account for the temporal nature of misinformation to ensure that experiments reflect expected real-world performance.
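The temporally out-of-domain evaluation described in this abstract can be illustrated with a minimal sketch. The record layout, the `temporal_split` helper, the toy posts, and the cutoff date are all hypothetical and are not taken from the paper; the only point is that the test set covers a time stretch unseen in training.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Split records into train (before cutoff) and test (on or after cutoff).

    Each record is a (timestamp, text, label) tuple. This layout is an
    assumption made for illustration, not the r/Fakeddit schema.
    """
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test


# Toy data, invented for the example.
records = [
    (date(2019, 1, 5), "post a", 0),
    (date(2019, 6, 1), "post b", 1),
    (date(2020, 2, 9), "post c", 1),
    (date(2020, 8, 3), "post d", 0),
]

# Everything from 2020 onward is held out as temporally out-of-domain data.
train, test = temporal_split(records, cutoff=date(2020, 1, 1))
```

A random split of the same records would mix both years into training and test, which is the "traditional" setup the abstract contrasts against.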

“…Concept drift occurs when testing or prediction data exhibits distribution shift [14], either in the data domain, or in the label domain [49]. Data domain shift can include introduction of new vocabularies, disappearance of existing words, and word polysemy [42]. Label domain shift occurs when the label space itself changes for the same type of data [22,35].…”

Section: Concept Drift (mentioning)

confidence: 99%
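The data domain shift mentioned in the statement above (new vocabulary appearing, existing words disappearing) can be made concrete with a small sketch. The `vocabulary_shift` helper, the whitespace tokenization, and the example posts are assumptions for illustration only; none of the cited papers is claimed to measure drift this way.

```python
def vocabulary_shift(old_docs, new_docs):
    """Compare the vocabularies of two time windows.

    Returns the words introduced in the new window, the words that
    vanished from the old window, and the Jaccard overlap of the two
    vocabularies. Tokenization is plain lowercase whitespace splitting.
    """
    old_vocab = {w for d in old_docs for w in d.lower().split()}
    new_vocab = {w for d in new_docs for w in d.lower().split()}
    introduced = new_vocab - old_vocab
    vanished = old_vocab - new_vocab
    union = old_vocab | new_vocab
    jaccard = len(old_vocab & new_vocab) / len(union) if union else 1.0
    return introduced, vanished, jaccard


# Toy windows, invented for the example.
old_window = ["flood warning downtown", "landslide blocks road"]
new_window = ["flood warning issued", "wildfire smoke downtown"]

introduced, vanished, overlap = vocabulary_shift(old_window, new_window)
```

A falling overlap between consecutive windows is one rough signal that the input distribution is moving. It says nothing about label domain shift, which needs labeled data to detect.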

Evaluating Generalizability of Fine-Tuned Models for Fake News Detection

Suprem, Pu

2022

Preprint

Self Cite

The Covid-19 pandemic has caused a dramatic and parallel rise in dangerous misinformation, denoted an 'infodemic' by the CDC and WHO. Misinformation tied to the Covid-19 infodemic changes continuously; this can lead to performance degradation of fine-tuned models due to concept drift. Degradation can be mitigated if models generalize well enough to capture some cyclical aspects of drifted data. In this paper, we explore generalizability of pre-trained and fine-tuned fake news detectors across 9 fake news datasets. We show that existing models often overfit on their training dataset and have poor performance on unseen data. However, on some subsets of unseen data that overlap with training data, models have higher accuracy. Based on this observation, we also present KMeans-Proxy, a fast and effective method based on K-Means clustering for quickly identifying these overlapping subsets of unseen data. KMeans-Proxy improves generalizability on unseen fake news datasets by 0.1-0.2 F1 points across datasets. We present both our generalizability experiments as well as KMeans-Proxy to further research in tackling the fake news problem.
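The idea of using K-Means to find unseen samples that overlap with training data can be sketched roughly as follows. This is not the authors' KMeans-Proxy: the abstract does not give its details, so the distance-threshold rule, the `radius` value, the 2-D toy "embeddings", and both helper names are assumptions made purely for illustration.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; centers start at the first k points."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        # Assignment step: attach each point to its nearest center.
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(c) / len(members) for c in zip(*members)]
    return centers


def overlaps_training(point, centers, radius):
    """Assumed rule: a point overlaps if it lies within radius of a center."""
    return min(math.dist(point, c) for c in centers) <= radius


# Toy 2-D embeddings, invented for the example.
train_embeddings = [
    (0.0, 0.0), (5.0, 5.0), (0.2, 0.1),
    (5.1, 4.9), (0.1, 0.3), (4.8, 5.2),
]
centers = kmeans(train_embeddings, k=2)

unseen = [(0.3, 0.2), (5.0, 5.1), (10.0, -3.0)]
flags = [overlaps_training(p, centers, radius=1.0) for p in unseen]
```

Here the first two unseen points fall near a training cluster and are flagged, while the third does not. In practice the embeddings would come from the fine-tuned model, and the threshold would need tuning per dataset.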
