Due to the increasing number of Hadith forgeries, it has become necessary to use artificial intelligence to assist those looking for authentic Hadiths. This paper presents detailed research on ways to automatically detect authenticity in Arabic Hadith texts. It examines deep learning-based and prediction by partial matching (PPM) compression-based classifiers, which have not previously been used to detect Hadith authenticity. The proposed methods were compared with the most recently used method, machine learning. The paper also describes in detail the new Arabic Hadith corpus (non-authentic Hadith corpus) created for this study and the authors' experiments, which also used the Leeds University and King Saud University (LK) Hadith corpus. The experiments demonstrate that the authentication based on Isnad obtained accuracy ranging from 84% to 93%. The authentication based on Matan obtained an accuracy range of 55% to 93%, while the accuracy range for this experiment was from 55% to 85%, which means that Isnad is the most effective part of Hadith for automatically detecting authenticity. Moreover, the experiment showed that Matan can be used to judge Hadith authenticity with an accuracy of 85%. The study also showed that PPM and deep learning classifiers are effective means of automatically detecting authentic Hadith.
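The idea behind a compression-based classifier can be illustrated with a minimal sketch: train one character model per class, then assign a new text to the class whose model encodes it in the fewest bits. The code below is not PPM and not the authors' implementation. It assumes a fixed-order character model with add-one smoothing as a simplified stand-in (PPM proper blends several context orders through an escape mechanism), and the class names, toy training strings, and the names `CharNgramModel` and `classify` are all hypothetical.

```python
import math
from collections import Counter


class CharNgramModel:
    """Fixed-order character model with add-one smoothing.

    A simplified stand-in for PPM (assumption): no escape mechanism,
    no blending of context orders.
    """

    def __init__(self, order=2):
        self.order = order
        self.context_counts = Counter()  # how often each context was seen
        self.ngram_counts = Counter()    # how often context + next char was seen
        self.alphabet = set()

    def train(self, text):
        padded = " " * self.order + text
        self.alphabet.update(padded)
        for i in range(self.order, len(padded)):
            context = padded[i - self.order:i]
            self.context_counts[context] += 1
            self.ngram_counts[context + padded[i]] += 1

    def code_length(self, text):
        """Approximate number of bits needed to encode text under this model."""
        padded = " " * self.order + text
        vocab = len(self.alphabet) + 1  # one extra slot for unseen characters
        bits = 0.0
        for i in range(self.order, len(padded)):
            context = padded[i - self.order:i]
            seen = self.ngram_counts[context + padded[i]]
            p = (seen + 1) / (self.context_counts[context] + vocab)
            bits -= math.log2(p)
        return bits


def classify(text, models):
    """Return the label whose model gives the shortest code length."""
    return min(models, key=lambda label: models[label].code_length(text))


# Toy training data (hypothetical, for illustration only).
samples = {
    "english": "this is a short english text for the experiment",
    "arabic": "هذا نص عربي قصير للتجربة",
}
models = {}
for label, text in samples.items():
    models[label] = CharNgramModel(order=2)
    models[label].train(text)

print(classify("a short text", models))
print(classify("نص عربي", models))
```

In a real setting each model would be trained on a full corpus for its class (for example, authentic versus non-authentic texts), and the smoothing scheme would strongly affect the results.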
Code-switching in online communication, where a writer switches between multiple languages, presents a challenge for natural language processing tools, since they are designed for texts written in a single language. To address this challenge, this paper presents detailed research on ways to detect code-switching in Arabic text automatically. We compare the prediction by partial matching (PPM) compression-based classifier, implemented in Tawa, with a traditional machine learning classifier, sequential minimal optimization (SMO), implemented in the Waikato Environment for Knowledge Analysis, working specifically on Arabic text taken from Facebook. Three experiments were conducted in order to: (1) detect code-switching between the Egyptian dialect and English; (2) detect code-switching among the Egyptian dialect, the Saudi dialect, and English; and (3) detect code-switching among the Egyptian dialect, the Saudi dialect, Modern Standard Arabic (MSA), and English. Our experiments showed that PPM achieved a higher accuracy rate than SMO, with 99.8% versus 97.5% in the first experiment and 97.8% versus 80.7% in the second. In the third experiment, PPM achieved a lower accuracy rate than SMO, with 53.2% versus 60.2%. Code-switching between Egyptian Arabic and English text is easiest to detect because Arabic and English are generally written in different character sets. It is more difficult to distinguish between Arabic dialects and MSA, as these use the same character set, and most users of Arabic, especially Saudis and Egyptians, frequently mix MSA with their dialects. We also note that the MSA corpus used for training the MSA model may not represent MSA Facebook text well, being built from news websites. This paper also describes in detail the new Arabic corpora created for this research and our experiments.
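The observation that Arabic and English use different character sets can be made concrete with a small sketch. It tags each whitespace-separated token by script using Unicode character names and counts the points where the script changes. This is only an illustration of why the Arabic/English case is easy, not the PPM or SMO method compared in the paper; the function names and the example sentence are hypothetical. Note that it cannot separate Egyptian, Saudi, and MSA text, since all three share the Arabic script.

```python
import unicodedata


def token_script(token):
    """Return 'arabic', 'latin', 'mixed', or 'other' for a single token."""
    scripts = set()
    for ch in token:
        if not ch.isalpha():
            continue  # ignore digits, punctuation, emoji
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            scripts.add("arabic")
        elif name.startswith("LATIN"):
            scripts.add("latin")
        else:
            scripts.add("other")
    if not scripts:
        return "other"
    if len(scripts) > 1:
        return "mixed"
    return scripts.pop()


def tag_tokens(text):
    """Pair every whitespace-separated token with its script label."""
    return [(tok, token_script(tok)) for tok in text.split()]


def count_switches(text):
    """Count positions where the script differs from the previous token."""
    labels = [label for _, label in tag_tokens(text) if label in ("arabic", "latin")]
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


# Hypothetical example sentence mixing Arabic-script and Latin-script tokens.
example = "انا رايح the mall بكرة"
print(tag_tokens(example))
print(count_switches(example))
```

A script check like this would fail on Arabic written in Latin characters, which is one reason a statistical classifier is still needed for real social media text.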