The quality of text features directly affects text classification performance. To obtain the text features that contribute most to classification within each category, this paper proposes a text classification method based on the LDA model and category semantic similarity. The method selects topic features of text documents with the LDA model and, using a word vector model, calculates the semantic similarity between these features and the categories. Each text feature is weighted according to its similarity, and these weights are used for both feature selection and text classification. Finally, experiments verify the feasibility, validity, and correctness of the algorithm.
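The similarity-based weighting step can be illustrated with a minimal sketch. The abstract does not give the similarity measure or the weighting formula, so cosine similarity and sum-to-one normalization are assumptions made here. The candidate words stand in for topic features already selected by an LDA model, and the names `cosine`, `feature_weights`, and `TOY_VECTORS`, along with the two-dimensional vectors, are hypothetical placeholders rather than the paper's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def feature_weights(topic_words, category_word, vectors):
    """Weight each candidate feature by its similarity to the category word.

    Assumption: negative similarities are clipped to zero and the weights are
    normalized to sum to 1. The paper may use a different scheme.
    """
    category_vector = vectors[category_word]
    similarities = {
        word: max(cosine(vectors[word], category_vector), 0.0)
        for word in topic_words
        if word in vectors  # skip words that have no embedding
    }
    total = sum(similarities.values())
    return {w: (s / total if total else 0.0) for w, s in similarities.items()}


# Hypothetical toy word vectors; a real system would load trained embeddings.
TOY_VECTORS = {
    "sports": [1.0, 0.0],
    "football": [0.9, 0.1],
    "league": [0.7, 0.3],
    "market": [0.1, 0.9],
}

# Candidate features, assumed to come from an LDA topic's top words.
weights = feature_weights(["football", "league", "market"], "sports", TOY_VECTORS)
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```

In this toy setup, words whose vectors point in nearly the same direction as the category word receive larger weights, so a feature selector could keep the top-ranked words for each category and discard the rest.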