2019
DOI: 10.48550/arxiv.1911.03854
Preprint

r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection

Cited by 23 publications
(42 citation statements)
References 15 publications
“…A major challenge with many of the approaches mentioned above is the need for more inclusive large-scale public datasets to allow NLP researchers to explore and work with low-resource languages and alternative modalities. There are promising endeavors to curate datasets of misinformation such as the Fakeddit dataset [26], Poynter and COVIDLies [17], as well as large community modeling efforts such as the FastAI Model Zoo, which could be expanded to cover a wider range of regions and languages. Similarly, efforts to build datasets from a wider range of sources - e.g., via the creation of radio transcription pipelines - could encourage researchers to work with data from alternative modalities.…”
Section: Discussion | mentioning
confidence: 99%
“…analysis is beneficial in contrast to monomodal processing. Our evaluation, conducted on a large-scale multimodal real-world dataset from Reddit [2], shows that multimodal processing strongly improves detection results. This leads us to two conclusions: (i) all modalities can provide useful clues for the detection of fake news and (ii) the proposed multilevel hierarchical information fusion allows to successfully capture information from all modalities.…”
Section: Introduction | mentioning
confidence: 93%
“…Table I lists different approaches developed for information disorder detection, including approaches for mis- and disinformation detection, rumor verification, and fake news detection. The related literature can be split into two groups, monomodal approaches [3]-[5] and multimodal approaches [2], [6]-[12].…”
Section: Related Work | mentioning
confidence: 99%
“…Parallel and sequential ensembling techniques have been widely studied and shown to improve performance of models in a variety of tasks [9]-[16]. Some works combine latent embeddings from different input modalities (feature fusion) and train the entire model together in a joint fashion [47]. Boosting algorithms such as Adaptive Boost [12], Gradient Boosting [13], and XG Boost [14] are common sequential ensembling techniques.…”
Section: B. Ensemble Techniques | mentioning
confidence: 99%
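
The feature fusion mentioned in the statement above can be illustrated with a minimal sketch. It is not taken from the citing paper or from the Fakeddit work. The function names, embedding values, and weights are all hypothetical, and the weights are fixed by hand rather than trained jointly as the statement describes.

```python
import math
from typing import List, Sequence


def fuse_features(text_emb: Sequence[float], image_emb: Sequence[float]) -> List[float]:
    """Feature fusion by concatenation: one joint vector per sample."""
    return list(text_emb) + list(image_emb)


def predict_proba(fused: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score over the fused vector (hypothetical, untrained weights)."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature dimension")
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative embeddings only; a real system would take these from
# a text encoder and an image encoder.
text_emb = [0.2, -0.1, 0.4]
image_emb = [0.7, 0.3]

fused = fuse_features(text_emb, image_emb)
weights = [0.5, -0.25, 0.1, 0.3, -0.2]  # hypothetical classifier weights
score = predict_proba(fused, weights, bias=0.0)
print(len(fused), round(score, 3))
```

In a jointly trained model the encoders and the classifier on top of the fused vector would be optimized together, whereas this sketch only shows the shape of the data flow.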