With the rapid growth of public opinion data, sentiment analysis of Weibo text plays an increasingly important role in monitoring online public opinion. The sparseness and high dimensionality of text data, together with the complex semantics of natural language, make sentiment analysis challenging. To address these problems, this paper proposes a new model based on BERT and deep learning for Weibo text sentiment analysis. Specifically, BERT first represents the text with dynamic word vectors, and a processed sentiment dictionary is used to enhance the sentiment features of these vectors. A BiLSTM then extracts the contextual features of the text, and the resulting vector representation is weighted by an attention mechanism. After weighting, a CNN extracts the important local sentiment features of the text, and the final sentiment feature representation is classified. A comparative experiment was conducted on a Weibo text dataset collected during the COVID-19 epidemic; the results showed that the proposed model performed significantly better than other similar models.
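The attention-weighting step described above can be illustrated with a minimal sketch. The abstract does not state which scoring function the model uses, so this example assumes simple dot-product scoring against a query vector; the names `attention_weight`, `hidden_states`, and `query` are hypothetical, and the small vectors stand in for BiLSTM outputs rather than reproducing them.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weight(hidden_states, query):
    """Weight each token vector by its dot-product relevance to `query`.

    Assumption: dot-product scoring; the paper may use a different
    (for example additive) scoring function.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    # Scale every token vector by its attention weight, keeping the
    # sequence shape so a later convolution could still slide over it.
    weighted = [[w * x for x in h] for w, h in zip(weights, hidden_states)]
    return weights, weighted


# Toy stand-ins for per-token contextual features (three tokens, two dims).
hidden_states = [[0.1, 0.2], [0.9, 0.7], [0.0, -0.1]]
query = [1.0, 1.0]  # hypothetical query vector; learned in a real model
weights, weighted = attention_weight(hidden_states, query)
```

Keeping the weighted output as a sequence, rather than summing it into a single vector, matches the order given in the abstract, where a CNN extracts local features after the attention step.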
Given the massive volume of text, dimensionality reduction algorithms and efficient classification models have become key to the sentiment classification of microblogs. The χ² (chi-square) statistic and the TF-IDF statistic are commonly used dimensionality reduction methods. When applied to microblog sentiment analysis, the traditional χ² statistic does not consider the probability of a given sentiment word appearing in a microblog text, and the TF-IDF weight measure ignores synonyms in the text. Therefore, this paper proposes NewChi-TF-IDF, a feature selection method that combines form and semantics. In the classification stage, a single classifier generalizes poorly. To improve the generalization performance of Weibo sentiment classification, a differential evolution algorithm is introduced on top of the existing ensemble strategy; it assigns different excitation functions to multiple weak classifiers and trains the optimal weight distribution, which solves the problem that the weights of weak classifiers are difficult to determine. Experimental results show that the NewChi-TF-IDF feature selection method reduces the dimension, that the generalization ability of the proposed algorithm is enhanced, and that its average F-score exceeds that of the Ada-All and Vote-All classifiers.
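For reference, the two baseline statistics that the abstract builds on can be sketched as follows. This shows only the textbook χ² term-class statistic and a common TF-IDF variant, not the NewChi-TF-IDF method itself, whose formula is not given here; the function names and the toy corpus are hypothetical.

```python
import math


def chi_square(a, b, c, d):
    """Textbook chi-square statistic for one term and one class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # term or class is degenerate; no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def term_class_counts(term, cls, corpus, labels):
    """Count the four contingency cells (a, b, c, d) for a term and class."""
    a = b = c = d = 0
    for doc, label in zip(corpus, labels):
        has_term = term in doc
        in_class = label == cls
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    return a, b, c, d


def tf_idf(term, doc, corpus):
    """One common TF-IDF variant: relative term frequency times log(N / df)."""
    df = sum(1 for other in corpus if term in other)
    if df == 0 or not doc:
        return 0.0
    tf = doc.count(term) / len(doc)
    return tf * math.log(len(corpus) / df)


# Hypothetical tokenized microblog posts with sentiment labels.
corpus = [["good", "day"], ["bad", "day"], ["good", "good", "mood"], ["bad", "mood"]]
labels = ["pos", "neg", "pos", "neg"]

chi_good = chi_square(*term_class_counts("good", "pos", corpus, labels))
chi_day = chi_square(*term_class_counts("day", "pos", corpus, labels))
weight_good = tf_idf("good", corpus[2], corpus)
```

In this toy corpus, "good" occurs only in positive posts and so receives a high χ² score, while "day" is spread evenly across both classes and scores zero, which is why a χ² threshold would discard it as a feature.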