Fake news is a complex problem that leads to different approaches used to identify them. In our paper, we focus on identifying fake news using its content. The used dataset containing fake and real news was pre-processed using syntactic analysis. Dependency grammar methods were used for the sentences of the dataset and based on them the importance of each word within the sentence was determined. This information about the importance of words in sentences was utilized to create the input vectors for classifications. The paper aims to find out whether it is possible to use the dependency grammar to improve the classification of fake news. We compared these methods with the TfIdf method. The results show that it is possible to use the dependency grammar information with acceptable accuracy for the classification of fake news. An important finding is that the dependency grammar can improve existing techniques. We have improved the traditional TfIdf technique in our experiment.
No abstract
This research proposes a novel technique for fake news classification using natural language processing (NLP) methods. The proposed technique, TwIdw (Term weight–inverse document weight), is used for feature extraction and is based on TfIdf, with the term frequencies replaced by the depth of the words in documents. The effectiveness of the TwIdw technique is compared to another feature extraction method—basic TfIdf. Classification models were created using the random forest and feedforward neural networks, and within those, three different datasets were used. The feedforward neural network method with the KaiDMML dataset showed an increase in accuracy of up to 3.9%. The random forest method with TwIdw was not as successful as the neural network method and only showed an increase in accuracy with the KaiDMML dataset (1%). The feedforward neural network, on the other hand, showed an increase in accuracy with the TwIdw technique for all datasets. Precision and recall measures also confirmed good results, particularly for the neural network method. The TwIdw technique has the potential to be used in various NLP applications, including fake news classification and other NLP classification problems.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.