With the ongoing debate on 'freedom of speech' vs. 'hate speech,' there is an urgent need to carefully understand the consequences of the inevitable culmination of the two, i.e., 'freedom of hate speech' over time. An ideal scenario to understand this would be to observe the effects of hate speech in an (almost) unrestricted environment. Hence, we perform the first temporal analysis of hate speech on Gab.com, a social media site with a very loose moderation policy. We first generate temporal snapshots of Gab from millions of posts and users. Using these temporal snapshots, we compute an activity vector based on the DeGroot model to identify hateful users. The amount of hate speech on Gab is steadily increasing, and new users are becoming hateful at a higher and faster rate. Further, our analysis reveals that hateful users occupy the prominent positions in the Gab network. Also, the language used by the community as a whole seems to correlate more with that of the hateful users than with that of the non-hateful ones. We discuss how our work opens up many crucial design questions in CSCW.
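As a rough illustration of the DeGroot model mentioned in the abstract above: each user's score is repeatedly replaced by an average of their own score and the scores of the accounts they listen to. The sketch below is not the authors' implementation. The uniform weights, the pinning of seed accounts, and the names `degroot_step`, `degroot`, `neighbors` and `seeds` are all assumptions made for illustration.

```python
def degroot_step(beliefs, neighbors):
    """One DeGroot update: a node's new belief is the mean of its own
    belief and the beliefs of the nodes it listens to.
    Uniform weights are an assumption of this sketch."""
    updated = {}
    for node in beliefs:
        sources = [node] + list(neighbors.get(node, []))
        updated[node] = sum(beliefs[s] for s in sources) / len(sources)
    return updated


def degroot(beliefs, neighbors, steps, seeds=()):
    """Run `steps` updates. Seed nodes keep their initial belief
    (an assumption, so that a known set of users stays fixed)."""
    pinned = {s: beliefs[s] for s in seeds}
    for _ in range(steps):
        beliefs = degroot_step(beliefs, neighbors)
        beliefs.update(pinned)
    return beliefs


# Toy example: "b" listens to "a", and "c" listens to "b".
initial = {"a": 1.0, "b": 0.0, "c": 0.0}
listens_to = {"b": ["a"], "c": ["b"]}
scores = degroot(initial, listens_to, steps=2, seeds=("a",))
```

In this toy run the score of the seed account `a` spreads to `b` and, more weakly, to `c`. Thresholding such scores is one conceivable way to flag users, but the threshold and the weighting are not specified in the abstract.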
Wikipedia can justifiably be called a behemoth, considering the sheer volume of content that is added to or removed from its several projects every minute. This creates immense scope in the field of natural language processing for developing automated tools for content moderation and review. In this paper we propose Self Attentive Revision Encoder (StRE), which leverages orthographic similarity of lexical units to predict the quality of new edits. In contrast to existing approaches, which primarily employ features like page reputation, editor activity or rule-based heuristics, we utilize the textual content of the edits, which we believe contains superior signatures of their quality. More specifically, we deploy deep encoders to generate representations of the edits from their text content, which we then leverage to infer quality. We further contribute a novel dataset containing ∼21M revisions across 32K Wikipedia pages and demonstrate that StRE outperforms existing methods by a significant margin: at least 17% and at most 103%. Our pretrained model achieves such results after retraining on a set as small as 20% of the edits in a wikipage. This, to the best of our knowledge, is also the first attempt at applying deep language models to the enormous domain of automated content moderation and review in Wikipedia.
Millions of people, irrespective of socioeconomic and demographic background, depend on Wikipedia articles every day to keep themselves informed about popular as well as obscure topics. Articles have been categorized by editors into several quality classes, which indicate their reliability as encyclopedic content. This manual designation is an onerous task because it requires profound knowledge of encyclopedic language, as well as navigating a circuitous set of wiki guidelines. In this paper we propose Neural wikipedia Quality Monitor (NwQM), a novel deep learning model which accumulates signals from several key information sources, such as article text, metadata and images, to obtain an improved Wikipedia article representation. We compare our approach against a plethora of available solutions and show an 8% improvement over state-of-the-art approaches, with detailed ablation studies.
Networks created from real-world data contain some inaccuracies or noise, manifested as small changes in the network structure. An important question is whether these small changes can significantly affect the analysis results. In this paper, we study the effect of noise on the ranks of the high centrality vertices. We compare, using the Jaccard Index (JI), how many of the top-k high centrality vertices from the original network are also among the top-k ranked vertices from the noisy network. We deem a network stable if the JI value is high. We observe two features that affect the stability. First, the stability depends on the number of top-ranked vertices considered. When the vertices are ordered according to their centrality values, they group into clusters. Perturbations to the network can change the relative ranking within a cluster, but vertices rarely move from one cluster to another. Second, the stability depends on the local connections of the high ranking vertices. The network is highly stable if the high ranking vertices are connected to each other. Our findings show that the stability of a network is affected by the local properties of high centrality vertices, rather than the global properties of the entire network. Based on these local properties we can identify the stability of a network without explicitly applying a noise model.
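The top-k comparison described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' code: degree centrality stands in for whichever centrality measure is studied, random edge removal is an assumed noise model, and every function name here is hypothetical.

```python
import random


def degree_centrality(edges):
    """Count incident edges per vertex (a stand-in for any centrality measure)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def top_k(centrality, k):
    """Set of the k highest-centrality vertices; ties are broken by vertex id."""
    ranked = sorted(centrality, key=lambda n: (-centrality[n], n))
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard Index |a & b| / |a | b|; taken as 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def remove_random_edges(edges, fraction, seed=0):
    """Assumed noise model: keep a random subset, dropping `fraction` of the edges."""
    rng = random.Random(seed)
    keep = len(edges) - int(fraction * len(edges))
    return rng.sample(edges, keep)


def top_k_stability(edges, k, fraction, seed=0):
    """JI between the top-k sets of the original and the perturbed network."""
    original = top_k(degree_centrality(edges), k)
    noisy_edges = remove_random_edges(edges, fraction, seed)
    noisy = top_k(degree_centrality(noisy_edges), k)
    return jaccard(original, noisy)


# Toy network: vertices 0-3 form a dense core, 4 and 5 hang off it.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
stability = top_k_stability(toy_edges, k=2, fraction=0.25, seed=1)
```

A JI close to 1 means the top-k set survives the perturbation. Repeating the call over several values of `k` and several seeds would show how the stability varies with the number of top-ranked vertices considered.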