Summary
Traditional feature-based semantic similarity (SS) approaches exploit Wikipedia features in terms of sets: they evaluate the similarity of concepts based on the commonalities among their feature sets. However, these feature-based approaches treat all features equally during similarity evaluation; they therefore ignore the underlying statistics of the features and lose essential semantic detail. One solution is to assign each feature a specific weight, derived from its statistics, that reflects the feature's relative importance in similarity evaluation. In this paper, based on two statistical models, i.e., information content and TF-IDF, we propose hybrid semantic similarity measurement methods. First, we propose new weighting functions to compute the weights of features and feature sets in Wikipedia. Second, based on these weighting functions, we propose new weighted feature-based SS approaches for Wikipedia concepts. Third, we evaluate the proposed methods on well-known benchmarks for English, German, French, and Spanish. Finally, we compare the performance of our methods with traditional feature-based and state-of-the-art SS approaches. The experimental evaluation shows that our weighted methods outperform both the traditional feature-based approaches and several state-of-the-art approaches in similarity evaluation.
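The core idea above, replacing plain set overlap with statistically weighted overlap, can be sketched as follows. This is a minimal illustration, not the authors' exact weighting functions: it assumes an IDF-style weight per feature and a weighted Jaccard similarity over feature sets.

```python
import math
from collections import Counter

def idf_weights(feature_sets):
    """Assign each feature an IDF-style weight from its document frequency
    across concept feature sets (illustrative; the paper's weighting
    functions based on information content and TF-IDF may differ)."""
    n = len(feature_sets)
    df = Counter(f for s in feature_sets for f in set(s))
    return {f: math.log(n / df[f]) + 1.0 for f in df}

def weighted_jaccard(a, b, w):
    """Weighted Jaccard: weight mass of shared features over the
    weight mass of all features of both concepts."""
    inter = sum(w[f] for f in a & b)
    union = sum(w[f] for f in a | b)
    return inter / union if union else 0.0
```

With this scheme, a rare feature shared by two concepts contributes more to their similarity than a feature common to most concepts, which is the intuition behind weighting features instead of treating them equally.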
Sentiment analysis of social media text is essential for applications such as project design, measuring customer satisfaction, and monitoring brand reputation. Deep learning models that automatically learn semantic and syntactic information have recently proved effective for sentiment analysis. Despite their good performance, earlier methods lack the syntactic information needed to guide feature learning for contextual semantic links in social media text. In this paper, we introduce an enhanced LSTM based on dependency parsing and a graph convolutional network (DPG-LSTM) for sentiment analysis. Our research investigates the importance of syntactic information in processing emotion in social media text. To fully utilize the semantic information of social media, we adopt a hybrid attention mechanism that incorporates dependency parsing to capture semantic contextual information: it redistributes higher attention scores to words with stronger dependency relations identified by dependency parsing. To validate the performance of DPG-LSTM from different perspectives, we conduct experiments on three tweet sentiment classification datasets (sentiment140, airline reviews, and self-driving car reviews) totaling 1,604,510 tweets. The experimental results show that the proposed DPG-LSTM model outperforms the state-of-the-art model by 2.1% in recall, 1.4% in precision, and 1.8% in F1 score on sentiment140.
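The redistribution step of the hybrid attention mechanism can be sketched as follows. This is an assumed formulation, not the authors' exact model: it mixes ordinary softmax attention with a distribution derived from each word's dependency-relation count, where the mixing coefficient `alpha` is a hypothetical parameter.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def hybrid_attention(scores, dep_counts, alpha=0.5):
    """Blend content-based attention with a dependency-based prior so that
    words with more dependency relations receive extra attention mass.
    `alpha` and the linear mixing scheme are illustrative assumptions."""
    att = softmax(scores)
    total_dep = sum(dep_counts)
    if total_dep > 0:
        dep = [d / total_dep for d in dep_counts]
    else:
        dep = [1.0 / len(dep_counts)] * len(dep_counts)
    # Convex combination of two distributions still sums to 1.
    return [alpha * a + (1.0 - alpha) * d for a, d in zip(att, dep)]
```

A word that is a syntactic hub (e.g. the root verb with many dependents) thus ends up with a higher final attention weight than content scores alone would give it, which matches the abstract's description of redistributing attention toward highly dependent words.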