In today's online services, users are often encouraged to provide feedback on each item, such as a numerical rating, a textual review, and the time of purchase. Managers of online services use this feedback to improve service quality and user experience. For example, many recommender systems use users' historical ratings to predict the items they may like and purchase in the future. As the amount of user data in these systems grows, more detailed and interpretable information about item features and user sentiments can be extracted from the textual reviews that accompany ratings. In this paper, we propose a novel topic and sentiment matrix factorization model that leverages both the topics and the sentiments drawn from reviews. First, we conduct topic analysis and sentiment analysis of reviews using Latent Dirichlet Allocation (LDA) and a lexicon construction technique, respectively. Second, we combine user consistency, which is calculated from each user's reviews and ratings, with the helpful votes that other users give to reviews to obtain a reliability measure for weighting the ratings. Third, we integrate these three parts into the matrix factorization framework for rating prediction. Our experimental comparison on Amazon datasets indicates that the proposed method significantly improves performance over traditional matrix factorization, by up to 14.12%.
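The idea of weighting each rating by a reliability measure inside matrix factorization can be illustrated with a minimal sketch. It is not the model proposed in the paper: the topic and sentiment components are omitted, the reliability weight `w` is assumed to be precomputed in [0, 1], and the names `train_weighted_mf` and `predict` as well as all hyperparameter values are hypothetical.

```python
import random

def train_weighted_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02,
                      epochs=500, seed=0):
    """Fit a biased-by-mean matrix factorization with per-rating weights.

    ratings: list of (user, item, rating, weight) tuples, where weight is an
    assumed, precomputed reliability score in [0, 1].
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    mu = sum(r for _, _, r, _ in ratings) / len(ratings)  # global mean rating
    for _ in range(epochs):
        for u, i, r, w in ratings:
            err = r - predict(mu, P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # The weight w scales the error term, so unreliable ratings
                # pull the latent factors less strongly.
                P[u][f] += lr * (w * err * qi - reg * pu)
                Q[i][f] += lr * (w * err * pu - reg * qi)
    return mu, P, Q

def predict(mu, P, Q, u, i):
    """Predicted rating: global mean plus the user-item factor dot product."""
    return mu + sum(pu * qi for pu, qi in zip(P[u], Q[i]))
```

In this sketch a rating with weight 0 contributes nothing to the fit, while a rating with weight 1 is treated as in ordinary matrix factorization.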
Online social media spreads messages at exponential speed. Users can freely publish comments on various web content across a characteristic network of communicators and viewers. Many of these comments contain users' emotions or opinions, which may evoke sympathy and influence others' comments. Moreover, such comments may trigger social responses, i.e., they may cause drastic fluctuations in the number of comments. In this study, using the content of textual comments, we propose two structural approaches (PDFCPL and PDFCML), both based on Long Short-Term Memory (LSTM), to predict future drastic fluctuations in the number of comments. To quantify each textual comment, we define two attributes: (1) its relevance to its topic, based on cosine similarity, and (2) the importance of its content, calculated by TF-IDF. Predictions are made from these attributes together with the number of previously observed comments. To evaluate the performance of our approaches, we conduct comparative experiments against other methods on real Twitter data. The results show that the proposed method PDFCPL outperforms existing methods in predicting the occurrence of drastic fluctuations in the number of comments.
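The two comment attributes can be sketched as follows. The abstract states only that relevance is based on cosine similarity and importance on TF-IDF, so the details here are assumptions: relevance is taken as the cosine similarity between the TF-IDF vectors of the comment and the topic text, importance is taken as the sum of the comment's TF-IDF weights, and the function names are hypothetical. The LSTM predictor itself is not shown.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per doc."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def comment_attributes(comment_tokens, topic_tokens, corpus):
    """Return (relevance, importance) for one comment.

    Assumed definitions: relevance is the cosine similarity to the topic,
    importance is the sum of the comment's TF-IDF weights.
    """
    vecs = tf_idf_vectors(corpus + [topic_tokens, comment_tokens])
    topic_vec, comment_vec = vecs[-2], vecs[-1]
    return cosine(comment_vec, topic_vec), sum(comment_vec.values())
```

A comment that shares no terms with the topic gets a relevance of zero under this definition, while a comment made of rare terms gets a high importance.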
On many online review sites and social media platforms, each user is encouraged to assign a numeric rating and write a textual review as feedback on each item they have obtained, e.g., a product they bought, a place they visited, or a service they received. Sometimes, users' feedback is affected by contextual factors such as weather, distance, time, and season. Therefore, context-aware approaches are being developed that use the user's contextual information to produce more precise recommendations than traditional approaches. Furthermore, previous works [6,14] have already shown that ignoring textual reviews leads to mediocre rating prediction performance. In this work, we propose TF+, a framework for rating prediction models based on Tensor Factorization (TF), which extends Matrix Factorization (MF) by adding another dimension. We consider seasonal context as the additional dimension. First, in our framework, each review is characterized by a numeric feature vector. Second, the framework uses TF trained by our proposed first-order gradient descent method for TF, named Feature Vector Gradient Descent (FVGD). When training TF in TF+, FVGD decides the learning rates based on the feature vectors of reviews. In our evaluation, we use pre-processed data from five cities in the Yelp challenge dataset and apply one of LDA, Doc2Vec, and SCDV to obtain numeric feature vectors of reviews. We conduct experimental comparisons, and the results show that methods based on TF+ improve performance significantly compared to the basic TF model.
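The general shape of tensor factorization with a context dimension and a review-dependent learning rate can be sketched as below. This is not FVGD as defined in the paper: the abstract does not say how feature vectors determine the learning rates, so `review_learning_rate` is a purely hypothetical placeholder rule, and the CP-style three-way product, the function names, and all hyperparameter values are likewise assumptions.

```python
import random

def review_learning_rate(base_lr, feature_vec):
    """HYPOTHETICAL rule, not the paper's: scale the base rate by the mean
    absolute feature value, with the scale clipped to [0.5, 1.5]."""
    if not feature_vec:
        return base_lr
    mean_abs = sum(abs(x) for x in feature_vec) / len(feature_vec)
    return base_lr * min(1.5, max(0.5, 0.5 + mean_abs))

def predict_tf(U, V, C, u, i, c):
    """CP-style prediction: sum over factors of user * item * context."""
    return sum(uf * vf * cf for uf, vf, cf in zip(U[u], V[i], C[c]))

def train_tf(samples, n_users, n_items, n_contexts, k=2, base_lr=0.02,
             reg=0.01, epochs=500, seed=0):
    """samples: list of (user, item, context, rating, review_feature_vector).

    The context index stands for the season in which the review was written.
    """
    rng = random.Random(seed)

    def init(n):
        return [[rng.uniform(0.5, 1.0) for _ in range(k)] for _ in range(n)]

    U, V, C = init(n_users), init(n_items), init(n_contexts)
    for _ in range(epochs):
        for u, i, c, r, feat in samples:
            lr = review_learning_rate(base_lr, feat)  # per-review step size
            err = r - predict_tf(U, V, C, u, i, c)
            for f in range(k):
                uu, vv, cc = U[u][f], V[i][f], C[c][f]
                U[u][f] += lr * (err * vv * cc - reg * uu)
                V[i][f] += lr * (err * uu * cc - reg * vv)
                C[c][f] += lr * (err * uu * vv - reg * cc)
    return U, V, C
```

Calling `train_tf` with `epochs=0` returns the initial factors, which gives a simple baseline for checking that training reduces the squared error on the observed ratings.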