Sarcasm is common on social networking sites, where people express negative thoughts, hatred, and opinions using positive vocabulary, which makes sarcasm challenging to detect. Although various studies have investigated sarcasm detection on baseline datasets, this work is the first to detect sarcasm in a multi-domain dataset constructed by combining Twitter and News Headlines datasets. This study proposes a hybrid approach in which a convolutional neural network (CNN) is used for feature extraction while a long short-term memory (LSTM) network is trained and tested on those features. For performance analysis, several machine learning algorithms, namely random forest, support vector classifier, extra tree classifier, and decision tree, are used. The performance of both the proposed model and the machine learning algorithms is analyzed using term frequency-inverse document frequency, the bag of words approach, and global vectors for word representation. Experimental results indicate that the proposed model surpasses the traditional machine learning algorithms, with an accuracy of 91.60%. Several state-of-the-art approaches for sarcasm detection are compared with the proposed model, and the results suggest that the proposed model outperforms these approaches in terms of precision, recall, and F1 score. The proposed model is accurate, robust, and performs sarcasm detection on a multi-domain dataset.
Sentiment analysis is the extraction and categorization of sentiments expressed in text data using text analysis techniques. As earlier studies have shown, sentiment analysis of drug reviews has large potential for providing valuable insights that help health-care professionals and companies evaluate the safety of drugs after they have been marketed. Such insights help safeguard patients and increase their trust in medical companies. Existing systems follow either a lexicon-based approach or a learning-based approach for sentiment analysis in the medical domain. Learning-based techniques require annotated data, while lexicon-based techniques tend to be domain-specific, which restricts their wide use. This research proposes a hybrid technique that uses both learning-based and lexicon-based approaches to achieve better results. General-purpose sentiment lexicons, such as AFINN, TextBlob, and VADER, are used for annotating the reviews. Furthermore, several feature engineering techniques, such as term frequency (TF), term frequency-inverse document frequency (TF-IDF), and the union of TF and TF-IDF (TF U TF-IDF), are used to extract useful features. Finally, learning models including logistic regression (LR), AdaBoost classifier (AB), random forest (RF), extra tree classifier (ETC), and multilayer perceptron (MLP) are used to classify the sentiments of the reviews. The performance of the proposed hybrid approach is evaluated using accuracy, precision, recall, and F1-score. Experimental results indicate that the combination of learning-based and lexicon-based approaches provides better results than either approach used alone. Moreover, TextBlob has shown promising results, giving an accuracy of 96% with MLP when used with TF-IDF and with LR when used with TF U TF-IDF.
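The hybrid idea in the abstract above, lexicon-based annotation followed by feature extraction for a learning model, can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: TOY_LEXICON is a hypothetical stand-in for AFINN, TextBlob, or VADER, the example reviews are invented, and the TF-IDF variant (relative term frequency times ln(N / df)) is an assumption, since the abstract does not specify one.

```python
import math
from collections import Counter

# Toy lexicon: a hypothetical stand-in for AFINN, TextBlob, or VADER scores.
TOY_LEXICON = {
    "effective": 2, "relief": 2, "great": 3,
    "nausea": -2, "worse": -2, "terrible": -3,
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (w.strip(".,!?;:").lower() for w in text.split())
    return [t for t in tokens if t]


def lexicon_label(text):
    """Annotate a review by summing the lexicon scores of its words."""
    score = sum(TOY_LEXICON.get(w, 0) for w in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def tf_idf(docs):
    """Return one {term: weight} dict per document, with idf = ln(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Invented example reviews, for illustration only.
reviews = [
    "Great relief, very effective.",
    "Terrible nausea and I felt worse.",
    "I took it for two weeks.",
]
labels = [lexicon_label(r) for r in reviews]   # lexicon-based annotation
features = tf_idf(reviews)                     # input to a learning model
```

The labels produced by the lexicon would then serve as training targets for a classifier such as LR or MLP fitted on the feature vectors.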
Vaccination against COVID-19 has raised serious concerns among the public, and various rumours have spread regarding resulting illness, adverse reactions, and death. Such rumours can damage the campaign against COVID-19 and should be dealt with accordingly. One prospective solution is to use machine learning-based models to predict the death risk for vaccinated people by utilizing the available data. This study focuses on the prognosis of three significant events, ‘not survived’, ‘recovered’, and ‘not recovered’, based on the adverse events following the second dose of the COVID-19 vaccine. Extensive experiments are performed to analyse the efficacy of the proposed Extreme Regression-Voting Classifier model in comparison with machine learning models using Term Frequency-Inverse Document Frequency (TF-IDF), Bag of Words, and Global Vectors features, and with deep learning models such as Convolutional Neural Network, Long Short Term Memory, and Bidirectional Long Short Term Memory. Experiments are carried out on the original dataset as well as on a dataset balanced using the Synthetic Minority Oversampling Technique (SMOTE). Results reveal that the proposed voting classifier in combination with TF-IDF performs best, with an accuracy score of 0.85 on the SMOTE-balanced dataset. In line with this, validation of the proposed voting classifier on binary classification shows state-of-the-art results with an accuracy of 0.98.
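The balancing step mentioned above rests on one idea: SMOTE creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below illustrates only that interpolation. The function name smote_like, its parameters, and the sample points are hypothetical, and this is not the implementation used in the study.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Invented two-dimensional minority-class samples, for illustration only.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (3.0, 3.0)]
new_points = smote_like(minority, n_new=4)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies, rather than duplicating existing rows.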
COVID-19 vaccination has raised serious concerns among the public, and people are preoccupied with various rumors regarding resulting illness, adverse reactions, and death. Such rumors are dangerous to the campaign against COVID-19 and should be dealt with appropriately and in a timely manner. One prospective solution is to use machine learning-based models to predict the death risk for vaccinated people and to clarify people’s perceptions regarding death risk. This study focuses on predicting the death risk for vaccinated people following the second dose, for two reasons: first, to build consensus among people to get vaccinated; second, to reduce fear regarding vaccines. To that end, this study uses the COVID-19 VAERS dataset, which records adverse events after COVID-19 vaccination as ‘recovered’, ‘not recovered’, and ‘survived’. To obtain better prediction results, a novel voting classifier, the extreme regression-voting classifier (ER-VC), is introduced. ER-VC ensembles an extra tree classifier and logistic regression using a soft voting criterion. To avoid model overfitting and obtain better results, two data balancing techniques, synthetic minority oversampling (SMOTE) and adaptive synthetic sampling (ADASYN), have been applied. Moreover, three feature extraction techniques, term frequency-inverse document frequency (TF-IDF), bag of words (BoW), and global vectors (GloVe), have been used for comparison. Both machine learning and deep learning models are deployed for experiments. Results obtained from extensive experiments reveal that the proposed model in combination with TF-IDF shows robust results, with an accuracy of 0.85 when trained on the SMOTE-balanced dataset. In line with this, validation of the proposed voting classifier on binary classification shows state-of-the-art results with an accuracy of 0.98. Results show that machine learning models can predict the death risk with high accuracy and can assist the authorities in taking timely measures.
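The soft voting criterion used by ER-VC can be illustrated without any library: each base model outputs class probabilities, the probabilities are averaged, and the class with the highest average wins. The sketch below assumes equal weights by default, and the probability values are invented for illustration. It is not the authors' code, and the names soft_vote, etc_probs, and lr_probs are hypothetical.

```python
def soft_vote(prob_sets, weights=None):
    """Average per-class probabilities from several models.

    Returns (index of the winning class, list of averaged probabilities).
    """
    n_models = len(prob_sets)
    weights = weights or [1.0] * n_models
    total = sum(weights)
    n_classes = len(prob_sets[0])
    avg = [
        sum(w * p[c] for w, p in zip(weights, prob_sets)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=avg.__getitem__), avg


classes = ["recovered", "not recovered", "survived"]

# Hypothetical class probabilities for a single report (assumed values).
etc_probs = [0.50, 0.30, 0.20]  # extra tree classifier
lr_probs = [0.20, 0.70, 0.10]   # logistic regression

winner, averaged = soft_vote([etc_probs, lr_probs])
predicted = classes[winner]
```

In this example the two models disagree on the top class, so hard (majority) voting would tie, whereas soft voting resolves the disagreement in favour of the more confident model. In scikit-learn, the same behaviour is available through VotingClassifier with voting="soft".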