Script identification at the character level in handwritten documents is a challenging task for Gurumukhi and Latin scripts because of character pairs that are slightly similar, quite similar, or at times confusing. Hence, a single feature set, or traditional feature sets with a single classifier, is found to be inadequate for processing handwritten documents. With the rise of deep learning, traditional feature extraction approaches have been somewhat neglected; this paper revisits them. It investigates machine learning and deep learning ensemble approaches at the feature extraction and classification levels for script identification. The approach is: (i) combining traditional and deep learning based features; (ii) evaluating various ensemble approaches using individual and combined feature sets to perform script identification; (iii) evaluating pre-trained deep networks using transfer learning for script identification; (iv) finding the best combination of feature set and classifiers for script identification. Three kinds of traditional features are employed: Gabor filter, Gray Level Co-Occurrence Matrix (GLCM), and Histograms of Oriented Gradients (HOG). For deep learning, pre-trained deep networks such as VGG19, ResNet50 and LeNet5 are used as feature extractors. These individual and combined features are used to train classifiers such as Support Vector Machines (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF). Further, several ensemble approaches such as voting, boosting and bagging are evaluated for script classification. Exhaustive experimental work resulted in a highest accuracy of 98.82%, obtained with features extracted from ResNet50 using transfer learning and a bagging-based ensemble classifier, which exceeds previously reported results.
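The combination of feature-level fusion and bagging described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are synthetic stand-ins for the handcrafted (Gabor, GLCM, HOG) and deep (e.g. ResNet50) features, the 1-nearest-neighbor base learner is an assumption made for brevity, and every function name is hypothetical.

```python
import random
from collections import Counter


def concat_features(*feature_sets):
    """Feature-level fusion: concatenate each sample's vectors from every extractor."""
    return [sum((list(v) for v in vectors), []) for vectors in zip(*feature_sets)]


def nearest_neighbor_predict(train_X, train_y, x):
    """Assumed base learner: 1-NN with squared Euclidean distance."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def bagging_fit(X, y, n_estimators=15, seed=0):
    """Draw one bootstrap sample (with replacement) per base learner."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Majority vote over the base learners' predictions."""
    votes = Counter(nearest_neighbor_predict(bx, by, x) for bx, by in models)
    return votes.most_common(1)[0][0]


def make_samples(center, n, dim, rng):
    """Synthetic feature vectors clustered around `center` (illustrative only)."""
    return [[center + rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
# Hypothetical stand-ins: 4-D "handcrafted" and 2-D "deep" features per character image.
handcrafted = make_samples(0.0, 20, 4, rng) + make_samples(5.0, 20, 4, rng)
deep = make_samples(0.0, 20, 2, rng) + make_samples(5.0, 20, 2, rng)
labels = ["Gurumukhi"] * 20 + ["Latin"] * 20

X = concat_features(handcrafted, deep)
models = bagging_fit(X, labels)
print(bagging_predict(models, [0.0] * 6), bagging_predict(models, [5.0] * 6))
```

In practice the synthetic vectors would be replaced by real descriptors computed from character images, and the base learner by whichever classifier is being bagged. The fusion step and the bootstrap-and-vote step would keep the same structure.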