Modeling user engagement dynamics on social media has compelling applications in market trend analysis, user-persona detection, and political discourse mining. Most existing approaches depend heavily on knowledge of the underlying user network. However, a large number of discussions happen on platforms that either lack any reliable social network (news portal, blogs, Buzzfeed) or reveal only partially the inter-user ties (Reddit, Stackoverflow). Many approaches require observing a discussion for some considerable period before they can make useful predictions. In real-time streaming scenarios, observations incur costs. Lastly, most models do not capture complex interactions between exogenous events (such as news articles published externally) and in-network effects (such as follow-up discussions on Reddit) to determine engagement levels.To address the three limitations noted above, we propose a novel framework, ChatterNet, which, to our knowledge, is the first that can model and predict user engagement without considering the underlying user network. Given streams of timestamped news articles and discussions, the task is to observe the streams for a short period leading up to a time horizon, then predict chatter: the volume of discussions through a specified period after the horizon.ChatterNet processes text from news and discussions using a novel time-evolving recurrent network architecture that captures both temporal properties within news and discussions, as well as influence of news on discussions. We report on extensive experiments using a two-month-long discussion corpus of Reddit, and a contemporaneous corpus of online news articles from the Common Crawl. ChatterNet shows considerable improvements beyond recent state-of-the-art models of engagement prediction. Detailed studies controlling observation and prediction windows, over 43 different subreddits, yield further useful insights.
Online hate speech, particularly over microblogging platforms like Twitter, has emerged as arguably the most severe issue of the past decade. Several countries have reported a steep rise in hate crimes infuriated by malicious hate campaigns. While the detection of hate speech is one of the emerging research areas, the generation and spread of topic-dependent hate in the information network remain under-explored. In this work, we focus on exploring user behavior, which triggers the genesis of hate speech on Twitter and how it diffuses via retweets. We crawl a large-scale dataset of tweets, retweets, user activity history, and follower networks, comprising over 161 million tweets from more than 41 million unique users. We also collect over 600k contemporary news articles published online. We characterize different signals of information that govern these dynamics. Our analyses differentiate the diffusion dynamics in the presence of hate from usual information diffusion. This motivates us to formulate the modeling problem in a topic-aware setting with real-world knowledge. For predicting the initiation of hate speech for any given hashtag, we propose multiple feature-rich models, with the best performing one achieving a macro F1 score of 0.65. Meanwhile, to predict the retweet dynamics on Twitter, we propose RETINA, a novel neural architecture that incorporates exogenous influence using scaled dot-product attention. RETINA achieves a macro F1-score of 0.85, outperforming multiple stateof-the-art models. Our analysis reveals the superlative power of RETINA to predict the retweet dynamics of hateful content compared to the existing diffusion models.
With the ease of access to information, and its rapid dissemination over the internet (both velocity and volume), it has become challenging to filter out truthful information from fake ones. The research community is now faced with the task of automatic detection of fake news, which carries real-world socio-political impact. One such research contribution came in the form of the Constraint@AAA12021 Shared Task on COVID19 Fake News Detection in English. In this paper, we shed light on a novel method we proposed as a part of this shared task. Our team introduced an approach to combine topical distributions from Latent Dirichlet Allocation (LDA) with contextualized representations from XLNet. We also compared our method with existing baselines to show that XLNet + Topic Distributions outperforms other approaches by attaining an F1-score of 0.967.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.