Summary
Social networking services are online platforms that are distributed across different computers over long distances. Twitter is the most popular microblogging site that allows users to share their opinions and real‐world events. Due to its popularity and ease of use, Twitter has also attracted spammers. As a result, spam detection is one of the most critical problems. In order to provide a spam‐free environment, it is necessary to identify and filter spam tweets as well as their owners. A hybrid method, which is based on Synthetic Minority Over‐sampling TEchnique (SMOTE) and Differential Evolution (DE) strategies, is presented to enhance the spam detection rate in real Twitter datasets. SMOTE is applied to tackle the imbalanced class distribution of datasets, while DE is used to tune Random Forest (RF) hyperparameters. Compared with related work and based on evaluation results, the presented method significantly enhances the classification performance in imbalanced datasets. The detection rate of optimized RF with excellent F1‐score and Area Under the Receiver Operating Characteristic Curve (AUROC), which are 98.97% and 0.999, respectively, demonstrates the high efficiency of the proposed method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.