Online fringe communities offer fertile grounds to users seeking and sharing ideas fueling suspicion of mainstream news and conspiracy theories. Among these, the QAnon conspiracy theory emerged in 2017 on 4chan, broadly supporting the idea that powerful politicians, aristocrats, and celebrities are closely engaged in a global pedophile ring. Simultaneously, governments are thought to be controlled by "puppet masters, " as democratically elected officials serve as a fake showroom of democracy.This paper provides an empirical exploratory analysis of the QAnon community on Voat.co, a Reddit-esque news aggregator, which has captured the interest of the press for its toxicity and for providing a platform to QAnon followers. More precisely, we analyze a large dataset from /v/GreatAwakening, the most popular QAnon-related subverse (the Voat equivalent of a subreddit), to characterize activity and user engagement. To further understand the discourse around QAnon, we study the most popular named entities mentioned in the posts, along with the most prominent topics of discussion, which focus on US politics, Donald Trump, and world events. We also use word embeddings to identify narratives around QAnon-specific keywords. Our graph visualization shows that some of the QAnon-related ones are closely related to those from the Pizzagate conspiracy theory and so-called drops by "Q." Finally, we analyze content toxicity, finding that discussions on /v/GreatAwakening are less toxic than in the broad Voat community.CCS CONCEPTS social networks.
This paper presents a dataset with over 3.3M threads and 134.5M posts from the Politically Incorrect board (/pol/) of the imageboard forum 4chan, posted over a period of almost 3.5 years (June 2016-November 2019). To the best of our knowledge, this represents the largest publicly available 4chan dataset, providing the community with an archive of posts that have been permanently deleted from 4chan and are otherwise inaccessible. We augment the data with a set of additional labels, including toxicity scores and the named entities mentioned in each post. We also present a statistical analysis of the dataset, providing an overview of what researchers interested in using it can expect, as well as a simple content analysis, shedding light on the most prominent discussion topics, the most popular entities mentioned, and the toxicity level of each post. Overall, we are confident that our work will motivate and assist researchers in studying and understanding 4chan, as well as its role on the greater Web. For instance, we hope this dataset may be used for cross-platform studies of social media, as well as being useful for other types of research like natural language processing. Finally, our dataset can assist qualitative work focusing on in-depth case studies of specific narratives, events, or social theories.
A large number of the most-subscribed YouTube channels target children of very young age. Hundreds of toddler-oriented channels on YouTube feature inoffensive, well produced, and educational videos. Unfortunately, inappropriate content that targets this demographic is also common. YouTube's algorithmic recommendation system regrettably suggests inappropriate content because some of it mimics or is derived from otherwise appropriate content. Considering the risk for early childhood development, and an increasing trend in toddler's consumption of YouTube media, this is a worrisome problem. In this work, we build a classifier able to discern inappropriate content that targets toddlers on YouTube with 84.3% accuracy, and leverage it to perform a large-scale, quantitative characterization that reveals some of the risks of YouTube media consumption by young children. Our analysis reveals that YouTube is still plagued by such disturbing videos and its currently deployed counter-measures are ineffective in terms of detecting them in a timely manner. Alarmingly, using our classifier we show that young children are not only able, but likely to encounter disturbing videos when they randomly browse the platform starting from benign videos.
Voat.co was a news aggregator website that shut down on December 25, 2020. The site had a troubled history and was known for hosting various banned subreddits. This paper presents a dataset with over 2.3M submissions and 16.2M comments posted from 113K users in 7.1K subverses (the equivalent of subreddit for Voat). Our dataset covers the whole lifetime of Voat, from its developing period starting on November 8, 2013, the day it was founded, April 2014, up until the day it shut down (December 25, 2020). This work presents the largest and most complete publicly available Voat dataset, to the best of our knowledge. Along with the release of this dataset, we present a preliminary analysis covering posting activity and daily user and subverse registration on the platform so that researchers interested in our dataset can know what to expect. Our data may prove helpful to false news dissemination studies as we analyze the links users share on the platform, finding that many communities rely on alternative news press, like Breitbart and GatewayPundit, for their daily discussions. In addition, we perform network analysis on user interactions finding that many users prefer not to interact with subverses outside their narrative interests, which could be helpful to researchers focusing on polarization and echo chambers. Also, since Voat was one of the platforms banned Reddit communities migrated to, we are confident our dataset will motivate and assist researchers studying deplatforming. Finally, many hateful and conspiratorial communities were very popular on Voat, which makes our work valuable for researchers focusing on toxicity, conspiracy theories, cross-platform studies of social networks, and natural language processing.
The QAnon conspiracy theory claims that a cabal of (literally) blood-thirsty politicians and media personalities are engaged in a war to destroy society. By interpreting cryptic “drops” of information from an anonymous insider calling themself Q, adherents of the conspiracy theory believe that Donald Trump is leading them in an active fight against this cabal. QAnon has been covered extensively by the media, as its adherents have been involved in multiple violent acts, including the January 6th, 2021 seditious storming of the US Capitol building. Nevertheless, we still have relatively little understanding of how the theory evolved and spread on the Web, and the role played in that by multiple platforms. To address this gap, we study QAnon from the perspective of “Q” themself. We build a dataset of 4,949 canonical Q drops collected from six “aggregation sites,” which curate and archive them from their original posting to anonymous and ephemeral image boards. We expose that these sites have a relatively low (overall) agreement, and thus at least some Q drops should probably be considered apocryphal. We then analyze the Q drops’ contents to identify topics of discussion and find statistically significant indications that drops were not authored by a single individual. Finally, we look at how posts on Reddit are used to disseminate Q drops to wider audiences. We find that dissemination was (initially) limited to a few sub-communities and that, while heavy-handed moderation decisions have reduced the overall issue, the “gospel” of Q persists on the Web.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.