<p><span lang="EN-US">Recognition of degraded printed compound Kannada characters is a challenging research problem. Experiments verify that noise removal is an essential preprocessing step. Two methods are proposed for the degraded Kannada character recognition problem. Method 1 applies the conventional histogram of oriented gradients (HOG) feature extraction. The extracted features are transformed and reduced using principal component analysis (PCA) and then classified. Various classifiers are evaluated. With this method, classification of simple compound characters is satisfactory (more than 98% accuracy); however, the method does not perform well on the other two compound types. Method 2 is a deep convolutional neural network (CNN) model for classification, which outperforms the HOG-based features and classification. Its highest classification accuracy is 98.8%, obtained for simple compound characters. The deep CNN performs far better on the other two compound types, and it also turns out to be better for pooled character classes.</span></p>
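To make the feature-extraction step of Method 1 concrete, the following is a minimal sketch of a HOG-style descriptor, assuming NumPy is available. It is a simplified illustration, not the authors' implementation: the function name `hog_features`, the `cell` and `bins` parameters, the use of unsigned orientations, and the single global L2 normalization (in place of overlapping-block normalization) are all assumptions made here.

```python
import numpy as np

def hog_features(image, cell=8, bins=9):
    """Simplified HOG (illustrative): one orientation histogram per cell.

    Assumptions: unsigned orientations in [0, 180), magnitude-weighted votes
    without interpolation, and one global L2 normalization instead of
    overlapping-block normalization.
    """
    img = np.asarray(image, dtype=float)
    # Finite-difference gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    # Fold the gradient direction into [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[r, c] = np.bincount(bin_index[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    features = hist.ravel()
    norm = np.linalg.norm(features)
    return features / norm if norm > 0 else features
```

For a 32x32 character image with `cell=8` and `bins=9`, the descriptor has 4 * 4 * 9 = 144 entries; a smaller cell size gives a longer, finer-grained descriptor.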
This paper addresses the preparation of a dataset of degraded Kannada characters and the robust recognition of such characters. The proposed recognition algorithm extracts histogram of oriented gradients (HOG) features with block sizes of 4x4 and 8x8, followed by principal component analysis (PCA) feature reduction. Various classifiers are evaluated, and the fine K-nearest neighbor classifier performs best. The performance of the proposed model is evaluated using 5-fold cross-validation and the receiver operating characteristic curve. The dataset contains 10440 characters in 156 classes (distinct characters), taken from 75 pages of poorly preserved old books. A comparison with other features, such as Haar wavelet and geometrical features, suggests that the proposed model is superior. The PCA-reduced features followed by the fine K-nearest neighbor classifier give the best accuracy, with acceptance rates of 98.6% and 97.9% for block sizes of 4x4 and 8x8 respectively. The experimental results show that HOG feature extraction achieves a high recognition rate and that the system is robust even with extensively degraded characters.
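The evaluation pipeline described above (PCA reduction, a nearest-neighbor classifier, 5-fold cross-validation) can be sketched as follows, assuming NumPy and a precomputed feature matrix `X` with one row per character and integer labels `y`. All function names and defaults are hypothetical. Taking `k=1` for the "fine" K-nearest neighbor setting is an assumption based on the preset of that name in MATLAB's Classification Learner, and the number of retained components is arbitrary because the abstract does not state it.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (as rows)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def knn_predict(train_X, train_y, test_X, k=1):
    """Majority vote among the k nearest training rows (labels: ints >= 0)."""
    # Squared Euclidean distances, shape (n_test, n_train). This builds a
    # dense 3-D array, so it is only suitable for small illustrative data.
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def cross_val_accuracy(X, y, n_components=20, k=1, folds=5, seed=0):
    """Mean accuracy over `folds` folds; PCA is refit on each training split."""
    order = np.random.default_rng(seed).permutation(len(X))
    scores = []
    for test_idx in np.array_split(order, folds):
        train_idx = np.setdiff1d(order, test_idx)
        # Fit PCA on training rows only, so no test information leaks in.
        mean, components = pca_fit(X[train_idx], n_components)
        train_Z = (X[train_idx] - mean) @ components.T
        test_Z = (X[test_idx] - mean) @ components.T
        pred = knn_predict(train_Z, y[train_idx], test_Z, k)
        scores.append(np.mean(pred == y[test_idx]))
    return float(np.mean(scores))
```

Refitting PCA inside each fold is a design choice made in this sketch to keep the held-out fold unseen during feature reduction; the abstract does not say whether the original experiments did the same.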