Owing to the mass availability of textual data on the Web, text classification (TC), the task of classifying texts into predetermined categories, has become a focus for researchers. Many TC applications have been proposed, yet very few studies have provided a proper, systematic overview of the TC research area. This paper aims to provide an overview of TC research trends and gaps by structuring and analyzing research patterns, the problems encountered, and the problem-solving methods used in TC. Specifically, the study highlights problem types, data sources, the language of the text, and the types of techniques applied in TC. A systematic mapping study was conducted following the guidelines proposed by Petersen and colleagues in 2007. Ninety-six publications from five electronic databases, published between 2006 and 2017, were systematically reviewed, following each step of the systematic mapping process. Nine main problems in the TC research area were identified, and significant findings highlighting the evolution of TC research over the past 12 years were examined. Unlike other review articles, this paper highlights the issues and technical gaps in the TC area.