Fast and accurate algorithms are necessary for Optical Character Recognition (OCR) systems to perform operations on document images such as pre-processing, segmentation, feature extraction, training and testing of classifiers and post processing. Text line and word segmentation are two important steps in any OCR system. Wrong segmentation may affect the accuracy rate of OCR systems. The segmentation is very challenging in cases of availability of different types of noises, degradations, and variation in writing and script characteristics. However, existing algorithms suffer from a flawed tradeoff between accuracy and speed. In this research work, Devanagri text line and word segmentation are carried out using modified standard profiling based segmentation approach and parallelized it on Graphics Processing Unit (GPU). The main goal of this research work is to make segmentation faster for processing a large number of document images using parallel implementation of algorithms on GPU. GPUs are emerging as powerful parallel systems at a cheaper cost. Our work employs extensive usage of highly multithreaded architecture and shared memory of multi-cored GPU. An efficient use of shared memory is required to optimize parallel reduction in Compute Unified Device Architecture (CUDA). Experimental results show that our method can achieve a speedup of about 20x-30x over the serial implementation when running on a GPU named GeForce 9500 GT having 32 cores.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.