Today, the rapid dissemination of information on digital platforms has given rise to information pollution such as misinformation, disinformation, fake news, and different types of propaganda. Information pollution has become a serious threat to the online digital world and poses several challenges to social media platforms and governments around the world. In this paper, we propose Propaganda Spotting in Online Urdu Language (ProSOUL), a framework to identify content and sources of propaganda spread in the Urdu language. First, we develop a labelled dataset of 11,574 Urdu news items to train the machine learning classifiers. Next, we develop the Linguistic Inquiry and Word Count (LIWC) dictionary to extract psycho-linguistic features of Urdu text. We evaluate the performance of different classifiers by varying n-gram, News Landscape (NELA), Word2Vec, and Bidirectional Encoder Representations from Transformers (BERT) features. Our results show that the combination of NELA, word n-gram, and character n-gram features performs best, with 0.91 accuracy for Urdu text classification. In addition, Word2Vec embeddings outperform BERT features in classification of the Urdu text, with 0.87 accuracy. Moreover, we develop and classify large-scale Urdu content repositories to identify web sources spreading propaganda. Our results show that the ProSOUL framework performs better for propaganda detection in online Urdu news content than in general web content. To the best of our knowledge, this is the first study on the detection of propaganda content in the Urdu language.
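The word and character n-gram features mentioned in the abstract above can be illustrated with a minimal sketch. This is not the ProSOUL implementation: the function names, the whitespace tokenization, the default n-gram sizes, and the `w:`/`c:` key prefixes are all assumptions made for illustration.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return word n-grams (as tuples) from whitespace-tokenized text."""
    tokens = text.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return character n-grams taken from the raw string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_features(text, word_n=2, char_n=3):
    """Count word and character n-grams in one combined feature dictionary.

    Hypothetical helper: the "w:" and "c:" prefixes only keep the two
    feature families apart inside a single Counter.
    """
    features = Counter()
    for gram in word_ngrams(text, word_n):
        features["w:" + " ".join(gram)] += 1
    for gram in char_ngrams(text, char_n):
        features["c:" + gram] += 1
    return features
```

Because the functions operate on Python strings, they work on Urdu text without modification; a real pipeline would likely add Urdu-specific tokenization and normalization before counting.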
The rapid adoption of online social media platforms has transformed the way people communicate and interact. On these platforms, discussions in the form of trending topics provide a glimpse of events happening around the world in real time. These trends are also used for political campaigns, public awareness, and brand promotions. Consequently, these trends are sensitive to manipulation by malicious users who aim to mislead the mass audience. In this article, we identify and study the characteristics of users involved in the manipulation of Twitter trends in Pakistan. We propose 'Manipify', a framework for automatic detection and analysis of malicious users for Twitter trends. Our framework consists of three distinct modules: i) user classifier, ii) hashtag classifier, and iii) trend analyzer. The user classifier introduces a novel approach to automatically detect manipulators using tweet content and user behaviour features. The module also classifies human and bot users. Next, the hashtag classifier categorizes trending hashtags into six categories, assisting in examining manipulators' behaviour across different categories. Finally, the trend analyzer module examines users, hashtags, and tweets for hashtag reach, linguistic features, and user behaviour. Our user classifier module achieves 0.91 accuracy in classifying the manipulators. We further test Manipify on a dataset comprising 665 trending hashtags with 5.4 million tweets and 1.9 million users. The analysis of trends reveals that the trending panel is mostly dominated by political hashtags. In addition, our results show a higher contribution of human accounts in trend manipulation as compared to bots. Furthermore, we present two case studies of hashtag-wars and anti-state propaganda to illustrate the real-world application of our research.
Twitter trends have enabled the speedy dissemination of information with the ability to affect public opinion. Unfortunately, fake trends are also generated by malicious users to mislead the public. In general, Twitter users are studied in depth to identify humans, bots, spam, and fake accounts. However, artificial intelligence algorithms have not been developed for the identification of 'trend promoters' generating fake trends. In this paper, we propose Push-To-Trend, a novel framework to detect 'trend promoters' in trending hashtags. For this purpose, we first develop a dataset, TREP-21, containing 3,900 users labelled into two categories: 'trend promoters' and 'normal users'. In addition, we design four discerning features for trend promoter classification: number of total tweets, duplicate tweets, overlapping n-grams, and peak-to-mean ratio. Moreover, we thoroughly examine the features used for spam and bot account classification to select three efficacious features for trend promoter identification. Leveraging these seven features, Push-To-Trend achieves an accuracy of 0.97 on TREP-21. Furthermore, we leverage our framework to identify and analyze trend promoters in the Urdu tweets repository "Anbar", which consists of 106.9 million tweets and 1.69 million users. The analysis of the 602 most frequent hashtags in Anbar reveals that 15.7% of trend promoters generate 68.1% of total tweets related to hashtags. To the best of our knowledge, this is the first attempt to design machine learning models for the automatic classification of trend promoters. As such, our framework is generic and adaptable for tweets posted in different natural languages, as it utilizes language-independent features. INDEX TERMS: Twitter trends, trend promoters, social media user classification, Twitter analytics.
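The four features named in the abstract above (total tweets, duplicate tweets, overlapping n-grams, and peak-to-mean ratio) could be computed per user roughly as follows. The abstract does not define them, so every definition here is an assumption: duplicates are counted after lowercasing and trimming, n-gram overlap is the fraction of distinct word n-grams that appear in more than one tweet, and the peak-to-mean ratio is taken over tweet counts per time bin. All function names are hypothetical.

```python
from collections import Counter


def total_tweets(tweets):
    """Number of tweets a user posted under a hashtag."""
    return len(tweets)


def duplicate_tweets(tweets):
    """Count tweets whose normalized text repeats another tweet (assumed definition)."""
    counts = Counter(t.strip().lower() for t in tweets)
    return sum(c - 1 for c in counts.values())


def overlapping_ngrams(tweets, n=2):
    """Fraction of distinct word n-grams found in more than one tweet (assumed definition)."""
    seen = Counter()
    for t in tweets:
        tokens = t.lower().split()
        # Use a set so each tweet contributes a given n-gram at most once.
        grams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        seen.update(grams)
    if not seen:
        return 0.0
    return sum(1 for c in seen.values() if c > 1) / len(seen)


def peak_to_mean_ratio(counts_per_bin):
    """Largest tweet count in any time bin divided by the mean count per bin."""
    if not counts_per_bin or sum(counts_per_bin) == 0:
        return 0.0
    mean = sum(counts_per_bin) / len(counts_per_bin)
    return max(counts_per_bin) / mean
```

None of these computations depends on the language of the tweets beyond whitespace tokenization, which is in keeping with the abstract's claim that the features are language-independent.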
The rapid adoption of online social media platforms has transformed the way people communicate and interact. On these platforms, discussions in the form of trending topics provide a glimpse of events happening around the world in real time. These trends are also used for political campaigns, public awareness, and brand promotions. Consequently, these trends are sensitive to manipulation by malicious users who aim to mislead the mass audience. In this article, we identify and study the characteristics of users involved in the manipulation of Twitter trends in Pakistan. We propose "Manipify", a framework for automatic detection and analysis of malicious users in Twitter trends. Our framework consists of three distinct modules: (1) user classifier, (2) hashtag classifier, and (3) trend analyzer. The user classifier module introduces a novel approach to automatically detect manipulators using tweet content and user behaviour features. The module also classifies human and bot users. Next, the hashtag classifier categorizes trending hashtags into six categories, assisting in examining manipulators' behaviour across different categories. Finally, the trend analyzer module examines users, hashtags, and tweets for hashtag reach, linguistic features, and user behaviour. Our user classifier module achieves 0.92 and 0.98 accuracy in classifying manipulators and bots, respectively. We further test Manipify on a dataset comprising 652 trending hashtags with 5.4 million tweets and 1.9 million users. The analysis of trends reveals that the trending panel is mostly dominated by political hashtags. In addition, our results show a higher contribution of human accounts in trend manipulation as compared to bots.