Segmentation is an important task in any Optical Character Recognition (OCR) system. It separates text document images into lines, words and characters. The accuracy of an OCR system depends mainly on the segmentation algorithm used. Segmentation of handwritten text in some Indian languages, such as Kannada, Telugu and Assamese, is difficult compared with Latin-based languages because of the structural complexity and larger character sets of their scripts. These scripts contain vowels, consonants and compound characters, and some characters may overlap. Despite several successful works in OCR all over the world, development of OCR tools for Indian languages is still an ongoing process. Character segmentation plays an important role in character recognition because incorrectly segmented characters are unlikely to be recognized correctly. In this paper, a scheme for segmenting handwritten Kannada script into lines, words and characters using morphological operations and projection profiles is proposed. The method was tested on totally unconstrained handwritten Kannada script, which poses greater challenge and difficulty because of the complexity of the script. The use of morphology made text line extraction efficient, with an average extraction rate of 94.5%. Because of the varying inter-word and intra-word gaps, average segmentation rates of 82.35% and 73.08% were obtained for words and characters, respectively.
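The projection-profile idea named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`ink_runs`, `segment_lines`, `segment_words`), the binary-image convention (1 = ink) and the `min_gap` threshold are assumptions made for illustration, and the gap merging is only a simple one-dimensional stand-in for the morphological step, whose details the abstract does not give.

```python
import numpy as np

def ink_runs(profile, min_gap=1):
    """Return half-open (start, end) ranges where profile > 0.

    Runs separated by fewer than min_gap empty positions are merged,
    a crude 1-D substitute for morphological closing (assumption).
    """
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i              # a new run of ink begins
        elif start is not None:
            runs.append((start, i))    # the run ended at the previous index
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], run[1])   # gap too small: join runs
        else:
            merged.append(run)
    return merged

def segment_lines(binary):
    """Text lines: runs of rows whose horizontal projection is non-zero."""
    return ink_runs(binary.sum(axis=1))

def segment_words(line_image, min_gap=3):
    """Words: runs of columns in the vertical projection, where gaps
    narrower than min_gap are treated as intra-word gaps."""
    return ink_runs(line_image.sum(axis=0), min_gap=min_gap)
```

Calling `segment_words` with `min_gap=1` yields candidate character boundaries instead of word boundaries. A single fixed threshold cannot separate inter-word from intra-word gaps when both vary, which is consistent with the lower word and character rates reported above.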
Combining classifiers appears as a natural step forward once a critical mass of knowledge about single classifier models has been accumulated. Although there are many unanswered questions about matching classifiers to real-life problems, the combination of classifiers is a rapidly growing area that is receiving a lot of attention from the pattern recognition and machine learning communities. For any pattern classification task, an increase in data size, number of classes, dimension of the feature space and interclass separability affects the performance of any classifier. It is essential to know the effect of the training dataset size on the recognition performance of a feature extraction method and classifier. In this paper, an attempt is made to measure the performance of the classifier by testing it with two datasets of different sizes. In practical classification applications, if the number of classes and multiple feature sets for pattern samples are given, a desirable recognition performance can be achieved by data fusion. In this paper, we propose a framework based on the combined concepts of decision fusion and feature fusion for the classification of isolated handwritten Kannada numerals. The proposed method improves the classification result: the experimental results show an increase of 13.95% in recognition accuracy.
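The two fusion levels mentioned in this abstract can be sketched generically as follows. The abstract does not state which features, classifiers or combination rule the framework uses, so everything here is an assumption for illustration: feature fusion is shown as plain concatenation of feature vectors, decision fusion as a majority vote over class labels, and the names `fuse_features` and `fuse_decisions` are hypothetical.

```python
import numpy as np

def fuse_features(*feature_sets):
    """Feature-level fusion (assumed form): concatenate the feature
    vectors of each sample. Every input has shape (n_samples, n_features_k)."""
    return np.concatenate(feature_sets, axis=1)

def fuse_decisions(predictions):
    """Decision-level fusion (assumed rule): majority vote.

    predictions has shape (n_classifiers, n_samples) and holds
    non-negative integer class labels, e.g. digits 0-9.
    Ties are resolved in favour of the smallest label.
    """
    predictions = np.asarray(predictions)
    n_classes = int(predictions.max()) + 1
    n_samples = predictions.shape[1]
    votes = np.zeros((n_classes, n_samples), dtype=int)
    columns = np.arange(n_samples)
    for row in predictions:
        votes[row, columns] += 1       # one vote per classifier per sample
    return votes.argmax(axis=0)
```

In use, each row passed to `fuse_decisions` would be the label output of one classifier, possibly trained on a different feature set or on the fused features. Other combination rules, such as weighted voting or averaging class probabilities, would fit the same interface.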