Fake news detection (FND) involves predicting the likelihood that a particular news article (news report, editorial, expose, etc.) is intentionally deceptive. Arabic FND started to receive more attention in the last decade, and many detection approaches demonstrated some ability to detect fake news on multiple datasets. However, most existing approaches do not consider recent advances in natural language processing, i.e., the use of neural networks and transformers. This paper presents a comprehensive comparative study of neural network and transformer-based language models used for Arabic FND. We examine the use of neural networks and transformer-based language models for Arabic FND and show their performance compared to each other. We also conduct an extensive analysis of the possible reasons for the difference in performance results obtained by different approaches. The results demonstrate that transformer-based models outperform the neural network-based solutions, which led to an increase in the F1 score from 0.83 (best neural network-based model, GRU) to 0.95 (best transformer-based model, QARiB), and it boosted the accuracy by 16% compared to the best in neural network-based solutions. Finally, we highlight the main gaps in Arabic FND research and suggest future research directions.
Frequent itemset mining (FIM) is a crucial tool for identifying hidden patterns in information. FP-Growth is an FIM algorithm used to find associations. When the data size increases, the execution of FIM algorithms on a single machine suffers from computational problems, such as memory and time consumption. For these reasons, parallel and distributed processing on platforms such as Spark is essential. The parallel frequent pattern (PFP) is the implementation of FP-Growth in Spark. The main problem with PFP is that it does not consider the load balancing between cluster units. This research proposes an enhanced balanced parallel frequent pattern ''EBPFP'' algorithm to enhance and balance the PFP. The proposed algorithm (EBPFP) proposes two ideas. First, a strategy for load balancing between groups is proposed to ensure that the items are evenly divided between the nodes, and the cluster resources are used more effectively. Second, the improved conditional pattern base (ICPB) method aims to remove infrequent items from the conditional pattern base before constructing local FP-Trees. The experimental results show that the proposed EBPFP algorithm outperforms PFP, and the difference in running time between EBPFP and PFP was 21.56% and 39.72%, respectively.INDEX TERMS Big data, data mining, association rule analysis, frequent pattern growth algorithm, spark, load balancing.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.