As a common language form in oral communication, short text is difficult to use in applications such as intent understanding and text classification because of its limited content and information, irregular expression, and missing components. To make short texts more usable in real applications, we propose a Label Information Assisting-based Model (LIAM) for Chinese short text classification. The model jointly uses sentence-level and word-level features to reduce the loss of text information. The sentence-level features are fused with relevant label information by the Label Information Extending and Fusion (LIEF) module, while the word-level features are likewise enhanced with the assistance of relevant label information. By using text-related information from the labels as extended information, the model enriches and strengthens the features of short text, which benefits classification. To verify the correctness and effectiveness of the proposed method, we conduct extensive experiments on four Chinese datasets and six sub-datasets with different models. The experimental results show that LIAM effectively enriches text information and substantially improves the performance of short text classification, performing much better than the other methods. Moreover, the smaller the training set, the greater the model's advantage.