Sentiment analysis is widely studied as a way to extract opinions from user-generated content (UGC), and various methods have been proposed in recent literature. However, these methods are prone to sentiment bias: their classification results skew toward one polarity, either positive or negative, and this is especially true of lexicon-based sentiment classification methods. Sentiment bias degrades the performance of sentiment analysis. To address this problem, we propose a novel sentiment bias processing strategy that can be applied to lexicon-based sentiment analysis methods. Weight and threshold parameters learned from a small training set are introduced into the lexicon-based sentiment scoring formula, and the formula is then used to classify reviews. Building on this strategy, we propose a complete sentiment classification framework. SentiWordNet (SWN) is used as the experimental sentiment lexicon, and review data for four products collected from Amazon are used as the experimental datasets. Experimental results show that the bias processing strategy reduces the polarity bias rate (PBR) and improves the performance of the lexicon-based sentiment analysis method.
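The idea of adding a learned weight and threshold to a lexicon-based score can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy lexicon stands in for SentiWordNet, the scoring form (positive sum minus a weighted negative sum) is one plausible choice, and the grid search is a simple stand-in for whatever learning procedure the authors use.

```python
from itertools import product

# Hypothetical toy lexicon: word -> (positive score, negative score).
# It stands in for a real lexicon such as SentiWordNet.
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "great": (0.875, 0.0),
    "fine": (0.5, 0.0),
    "bad": (0.0, 0.625),
    "poor": (0.0, 0.75),
    "broken": (0.0, 0.5),
}


def review_score(tokens, neg_weight):
    """Positive evidence minus weighted negative evidence (assumed form)."""
    pos = sum(TOY_LEXICON.get(t, (0.0, 0.0))[0] for t in tokens)
    neg = sum(TOY_LEXICON.get(t, (0.0, 0.0))[1] for t in tokens)
    return pos - neg_weight * neg


def classify(tokens, neg_weight, threshold):
    """Label a review positive only if its score exceeds the threshold."""
    score = review_score(tokens, neg_weight)
    return "positive" if score > threshold else "negative"


def fit(train, weights, thresholds):
    """Grid-search the (weight, threshold) pair with the best training accuracy.

    Returns (accuracy, weight, threshold); ties keep the first pair tried.
    """
    best = None
    for w, th in product(weights, thresholds):
        hits = sum(classify(tokens, w, th) == label for tokens, label in train)
        acc = hits / len(train)
        if best is None or acc > best[0]:
            best = (acc, w, th)
    return best


# Small hypothetical training set of tokenized reviews with gold labels.
train = [
    ("good great fine".split(), "positive"),
    ("good bad".split(), "negative"),
    ("fine poor broken".split(), "negative"),
    ("great".split(), "positive"),
]

baseline = classify("good bad".split(), neg_weight=1.0, threshold=0.0)
accuracy, weight, threshold = fit(train, [1.0, 1.5, 2.0], [0.0, 0.25])
corrected = classify("good bad".split(), weight, threshold)
```

In this toy setup the unweighted, zero-threshold baseline labels the mixed review "good bad" as positive, while the fitted parameters move it to negative, which is the kind of polarity skew the strategy is meant to correct.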
Analyzing massive volumes of user-generated microblogs is crucial in many fields and has attracted considerable research attention. However, processing such short, noisy texts is challenging. Most prior work uses only text to identify sentiment polarity and assumes that microblogs are independent and identically distributed, ignoring the fact that microblogs are networked data. As a result, performance is often unsatisfactory. Inspired by two sociological theories (sentimental consistency and emotional contagion), we propose a new method that combines social context and topic context to analyze microblog sentiment. In particular, unlike previous work that uses direct user relations, we introduce structure similarity context into the social context and propose a method to measure structure similarity. In addition, we introduce topic context to model the semantic relations between microblogs. Social context and topic context are combined through the Laplacian matrix of the graph built from these contexts, and Laplacian regularization is added to the microblog sentiment analysis model. Experimental results on two real Twitter datasets demonstrate that the proposed model consistently and significantly outperforms baseline methods.
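Laplacian regularization over a combined context graph can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's model: the least-squares loss, the ridge term, the convex mix of the two adjacency matrices, and all names (`mix`, `alpha`, `beta`, the toy features and graphs) are hypothetical choices made so the problem has a closed-form solution.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized graph Laplacian L = D - A for a symmetric adjacency matrix."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def fit_laplacian_regularized(X, Y, social_adj, topic_adj,
                              mix=0.5, alpha=1.0, beta=0.1):
    """Solve min_W ||XW - Y||^2 + alpha * tr((XW)^T L (XW)) + beta * ||W||^2.

    The graph term penalizes connected microblogs for receiving different
    predicted sentiment scores. Loss and penalties are assumed forms.
    """
    combined = mix * social_adj + (1.0 - mix) * topic_adj
    L = graph_laplacian(combined)
    n_features = X.shape[1]
    lhs = X.T @ X + alpha * (X.T @ L @ X) + beta * np.eye(n_features)
    return np.linalg.solve(lhs, X.T @ Y)


def predict(X, W):
    """Pick the class with the highest score for each microblog."""
    return np.argmax(X @ W, axis=1)


# Hypothetical toy data: 4 microblogs, 2 text features, 2 sentiment classes.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

# Social context links microblogs 0-1 and 2-3; topic context does the same here.
social_adj = np.array([[0, 1, 0, 0],
                       [1, 0, 0, 0],
                       [0, 0, 0, 1],
                       [0, 0, 1, 0]], dtype=float)
topic_adj = social_adj.copy()

W = fit_laplacian_regularized(X, Y, social_adj, topic_adj)
labels = predict(X, W)
```

Setting `alpha` to zero recovers plain ridge regression on text features alone, which makes the contribution of the graph term easy to isolate in an experiment.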