Social media and easy internet access have allowed the instant sharing of news, ideas, and information on a global scale. However, this rapid spread of and instant access to information can also allow rumors and fake news to spread easily and quickly. To monitor and minimize the spread of fake news in the digital community, fake news detection using Natural Language Processing (NLP) has attracted significant attention. In NLP, different text feature extractors and word embeddings are used to process text data. The aim of this paper is to analyze the performance of a neural-network-based fake news detection model using three feature extractors: the TF-IDF vectorizer, GloVe embeddings, and BERT embeddings. For the evaluation, multiple metrics, namely accuracy, precision, recall, F1, AUC ROC, and AUC PR, were computed for each feature extractor. The output of each transformation technique was fed to the deep learning model. BERT embeddings delivered the best performance for text transformation. TF-IDF performed far better than GloVe and was competitive with BERT at some stages.
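To make the TF-IDF transformation concrete, the following is a minimal pure-Python sketch of how documents can be turned into TF-IDF vectors before being passed to a classifier. It is not the paper's implementation: the function name `tfidf_vectors`, the whitespace tokenizer, the smoothed IDF formula, and the two sample sentences are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors), with one TF-IDF vector per document."""
    # Assumption: lowercase whitespace tokenization; real pipelines clean text further.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))

    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1; other variants exist.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical two-document corpus.
docs = ["fake news spreads fast", "real news spreads slowly"]
vocab, vectors = tfidf_vectors(docs)
```

Terms that occur in only one document, such as "fake", receive a higher weight than terms shared by both documents, such as "news". Each resulting vector has one entry per vocabulary term and could serve as the input layer of a neural classifier.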
Sentiment analysis is a method used to analyze the sentiment of feedback given by a user in an online document, such as a blog, comment, or review, and to classify it as negative, positive, or neutral. The classification process relies on analyzing the polarity features of the natural language text given by users. Polarity analysis has been an important subtask in sentiment analysis; however, detecting the correct polarity has been a major issue. Researchers have used various polarity features, including standard part-of-speech (POS) tags for adjectives, adverbs, verbs, and nouns. However, there seems to be a lack of research focusing on the subcategories of these tags. The aim of this research was to propose a method that better recognizes the polarity of natural language text by using polarity features drawn from combinations of the standard POS categories and their subcategories, in order to explore the specific polarity of text. Several experiments were conducted to examine and compare the efficacy of the proposed method in terms of F-measure, recall, and precision using an Amazon dataset. The results showed that JJ + NN + VB + RB + VBP + RP, a POS subcategory combination, outperformed the baseline approaches by 4.4% in terms of F-measure.
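As an illustration of using a POS tag combination as polarity features, the sketch below keeps only the words whose tag belongs to the reported combination and includes a helper for the F-measure. It is a hedged sketch, not the authors' method: it assumes the text has already been tagged with Penn Treebank style tags, it matches tags exactly, and the names `POLARITY_TAGS`, `select_polarity_features`, and `f_measure`, as well as the sample review, are hypothetical.

```python
# Tag combination reported in the abstract (assumed to be Penn Treebank tags).
POLARITY_TAGS = {"JJ", "NN", "VB", "RB", "VBP", "RP"}


def select_polarity_features(tagged_tokens, tags=POLARITY_TAGS):
    """Keep the lowercased words whose POS tag is in the chosen combination.

    Assumption: exact tag matching, so "VBZ" is not selected by "VB".
    """
    return [word.lower() for word, tag in tagged_tokens if tag in tags]


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical, hand-tagged review sentence; a real pipeline would use a POS tagger.
review = [
    ("The", "DT"), ("battery", "NN"), ("really", "RB"),
    ("lasts", "VBZ"), ("and", "CC"), ("feels", "VBZ"), ("great", "JJ"),
]
features = select_polarity_features(review)
```

The selected words would then be passed to a polarity classifier, and different tag sets can be compared by swapping the `tags` argument and recomputing precision, recall, and F-measure on the same data.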