This manuscript presents a study of principal component analysis (PCA) and its application to text mining. We begin with the theoretical basis of the method and then focus on two of its variants, neural PCA and kernel PCA. We use neural PCA to categorize text documents automatically by extracting semantic concepts. Our second contribution is the use of PCA (neural and kernel) to reduce the dimensionality of text documents in the context of automatic classification.
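To make the dimensionality-reduction idea concrete, here is a minimal sketch of ordinary linear PCA applied to a tiny term-count matrix. It is not the neural or kernel variant the manuscript studies, and it is not the authors' implementation. The four documents, the four-term vocabulary, the function names, and the use of power iteration to find the first component are all assumptions made for illustration.

```python
import math

# Hypothetical term counts: one row per document, one column per term
# (assumed vocabulary: "pca", "kernel", "movie", "rating").
DOCS = [
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
]

def center_columns(rows):
    """Subtract each column's mean so the data is centered."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of the centered data (terms x terms)."""
    n = len(centered)
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration.

    Assumes the start vector (first basis vector) is not orthogonal to
    the dominant eigenvector, which holds for the toy data above.
    """
    d = len(cov)
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

centered = center_columns(DOCS)
component = first_component(covariance(centered))

# Project each 4-dimensional document onto the component: one number per document.
scores = [sum(x * c for x, c in zip(row, component)) for row in centered]
print(scores)
```

Each document is reduced from four term counts to a single score. In this toy data the first two documents land on one side of zero and the last two on the other, which is the kind of latent "concept" separation a classifier could then use. The sign of the component itself is arbitrary.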
Abstract. Given the huge number of products, sites, information, etc., finding what a user actually needs is an important task. Recommendation Systems (RS) guide users in a personalized way to objects of interest within a large space of possible options. This paper presents an algorithm for recommending movies. We break the recommendation task into two steps: (1) grouping like-minded users, and (2) creating a model for each group to predict user-movie ratings. In the first step, we use Principal Component Analysis to retrieve latent groups of similar users. In the second step, we employ three different regression algorithms to build models and predict ratings. We evaluate our results against the SVD++ algorithm using the MAE and RMSE measures. The results show that the presented algorithm improves the MAE by about 0.42 and the RMSE by about 0.5201.
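The two error measures named in the abstract can be sketched as follows. MAE is the mean absolute difference between actual and predicted ratings, and RMSE is the square root of the mean squared difference. The function names and the four example ratings are made up for illustration and are not taken from the paper's data.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between two equal-length rating lists."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical user-movie ratings and model predictions.
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 2.5]

print(mae(actual_ratings, predicted_ratings))   # 0.5
print(rmse(actual_ratings, predicted_ratings))
```

Because RMSE squares each error before averaging, a single large miss (here the 1.0 error on the third rating) raises it more than it raises MAE. Lower values are better for both measures.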