Abstract:The exponential increase in online reviews and recommendations makes document classification and sentiment analysis a hot topic in academic and industrial research. Traditional deep learning based document classification methods require the use of full textual information to extract features. In this paper, in order to tackle long document, we proposed three methods that use local convolutional feature aggregation to implement document classification. The first proposed method randomly draws blocks of continuous words in the full document. Each block is then fed into the convolution neural network to extract features and then are concatenated together to output the classification probability through a classifier. The second model improves the first by capturing the contextual order information of the sampled blocks with a recurrent neural network. The third model is inspired by the recurrent attention model (RAM), in which a reinforcement learning module is introduced to act as a controller for selecting the next block position based on the recurrent state. Experiments on our collected four-class arXiv paper dataset show that the three proposed models all perform well, and the RAM model achieves the best test accuracy with the least information.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.