Aspect-based sentiment analysis is a major subarea of sentiment analysis. Many supervised and unsupervised approaches have been proposed for detecting aspect terms and analyzing their sentiment. This paper proposes a graph-based semi-supervised learning approach for aspect term extraction. In this approach, every identified token in a review document is classified as an aspect or non-aspect term from a small set of labeled tokens using the label spreading algorithm. k-Nearest Neighbor (kNN) graph sparsification is employed to make the approach more time- and memory-efficient. The work is further extended to determine the polarity of the opinion words associated with the identified aspect terms in each review sentence, in order to generate a visual aspect-based summary of the review documents. The experimental study is conducted on benchmark and crawled datasets from the restaurant and laptop domains with varying numbers of labeled instances. The results show that the proposed approach achieves good Precision, Recall, and Accuracy with limited labeled data.
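The idea of spreading labels over a kNN-sparsified graph can be illustrated with a minimal sketch. It assumes each token is already represented as a numeric feature vector (the toy 2-D points below are hypothetical, as are the function names and the values of `k`, `alpha`, and `iters`), and it uses the standard normalized-graph update F <- alpha * S * F + (1 - alpha) * Y rather than the paper's exact configuration, which the abstract does not specify.

```python
import math


def knn_graph(points, k):
    """Symmetric kNN adjacency: an edge exists if either node lists the other
    among its k nearest neighbours. All other pairs are dropped (sparsification)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            w[i][j] = w[j][i] = 1.0
    return w


def label_spreading(points, labels, k=2, alpha=0.8, iters=50):
    """labels[i] is a class (e.g. 1 = aspect, 0 = non-aspect) or None if unlabeled."""
    n = len(points)
    classes = sorted({lab for lab in labels if lab is not None})
    w = knn_graph(points, k)
    deg = [sum(row) for row in w]
    # Symmetrically normalized affinity matrix S = D^(-1/2) W D^(-1/2).
    s = [
        [w[i][j] / math.sqrt(deg[i] * deg[j]) if w[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # One-hot seed matrix Y; unlabeled rows are all zeros.
    y = [[1.0 if labels[i] == c else 0.0 for c in classes] for i in range(n)]
    f = [row[:] for row in y]
    for _ in range(iters):
        f = [
            [
                alpha * sum(s[i][j] * f[j][c] for j in range(n))
                + (1 - alpha) * y[i][c]
                for c in range(len(classes))
            ]
            for i in range(n)
        ]
    # Each token takes the class with the largest propagated score.
    return [classes[max(range(len(classes)), key=lambda c: f[i][c])] for i in range(n)]


# Hypothetical token vectors: two tight clusters, one labeled seed in each.
tokens = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
          (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1)]
seeds = [1, None, None, None, 0, None, None, None]
predicted = label_spreading(tokens, seeds)
```

With only two seed labels, every token in the first cluster is assigned the aspect class and every token in the second the non-aspect class, because the kNN graph keeps edges only between nearby tokens.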
In our work, we propose an ensemble of local and global filter-based feature selection methods to reduce the high dimensionality of the feature space and increase the accuracy of spam review classification. The selected features are then used to train various classifiers for spam detection. Experimental results with four classifiers on two available datasets of hotel reviews show that the proposed feature selector improves spam classification performance in terms of well-known metrics such as AUC score.
The increase in the volume of opinions posted on social media sites has led to a tremendous increase in the dimensionality of the data used for sentiment analysis. Selecting informative features from textual data can improve the performance of supervised learning methods. In this article, we propose a novel and efficient method for integrating different filter-based feature selection methods for sentiment classification. The ensemble method uses hesitant fuzzy sets to represent the opinions of the different filter-based feature selection methods in order to optimize the relevancy score between features and class labels. Based on this relevancy score, the top-k ranked features are selected for sentiment classification. The proposed feature selection method was evaluated with Naïve Bayes and Support Vector Machine classifiers on three of the most widely used sentiment analysis datasets, using Unigram and Parts-of-Speech based text representation schemes. Performance is evaluated using five-fold cross-validation, and the results show that the proposed method can achieve higher accuracy with only 10-25% of the total extracted features. Statistical tests confirm that aggregation using hesitant fuzzy sets is more effective than the baseline feature selection methods on Parts-of-Speech features in terms of the performance metrics.
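The general pattern of combining several filter scores into one relevancy score and keeping the top-k features can be sketched as follows. This is only an illustration: it substitutes a plain mean of min-max normalized scores for the hesitant fuzzy set aggregation described above, and the function names, the filter names, and the toy scores are all hypothetical.

```python
def min_max(scores):
    """Rescale one filter's scores to [0, 1] so different filters are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {feat: (val - lo) / span if span else 0.0 for feat, val in scores.items()}


def top_k_features(filter_scores, k):
    """Average the normalized scores of all filters and return the k best features.

    filter_scores: list of dicts mapping feature -> raw score, one dict per filter.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    normalized = [min_max(scores) for scores in filter_scores]
    features = normalized[0].keys()
    combined = {
        feat: sum(norm[feat] for norm in normalized) / len(normalized)
        for feat in features
    }
    return sorted(combined, key=lambda feat: (-combined[feat], feat))[:k]


# Hypothetical raw scores from two filter methods over the same unigram features.
chi_square = {"good": 9.0, "bad": 8.0, "the": 1.0, "table": 3.0}
info_gain = {"good": 0.30, "bad": 0.35, "the": 0.05, "table": 0.05}

selected = top_k_features([chi_square, info_gain], k=2)
```

Here `selected` holds the two sentiment-bearing words, while the low-scoring "the" and "table" are discarded. Normalizing before averaging keeps a filter with a large numeric range from dominating the combined ranking.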