Urdu word spotting is among the most challenging tasks in image processing, and word spotting in handwritten Urdu text is even more so. In handwritten Urdu documents, the same word varies significantly from one writer to another. The orientation and style of the handwriting make it difficult for a word spotting system to correctly recognize instances of a keyword. In this research, we aim to overcome this hurdle. We propose a system that takes a database of handwritten Urdu text and generates random yet similar images to improve the classifier's ability to recognize variations caused by differences in handwriting. For image generation, we used geometric transformations and variants of the Generative Adversarial Network (GAN). For word spotting, Histogram of Oriented Gradients (HOG) features are extracted from ligature images and then used to train a Long Short-Term Memory (LSTM) network for classification. This is the first study that focuses on improving word spotting by generating arbitrary samples using GANs and their variants. The system achieved a promising recognition rate of 98.96%, owing to sample generation using Cycle-GANs.
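The abstract gives no implementation details, so the following is only a minimal sketch of two of the steps it names: a geometric transformation used for augmentation and HOG-style feature extraction from a ligature image. The function names (`shear_image`, `hog_cell_histogram`, `hog_sequence`), the choice of a horizontal shear, the 8-pixel cell size, the 9 orientation bins, and the idea of emitting one feature vector per column of cells as a sequence for an LSTM are all assumptions made for illustration. The block omits the block normalization of full HOG, the GAN-based generation, and the LSTM itself.

```python
import math


def shear_image(img, shear):
    """Horizontally shear a grayscale image given as a list of rows.

    Hypothetical augmentation step: each row is shifted in proportion to
    its distance from the vertical centre, imitating a change of slant.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        shift = int(round(shear * (y - h / 2)))
        for x in range(w):
            src = x - shift
            if 0 <= src < w:
                out[y][x] = img[y][src]
    return out


def hog_cell_histogram(img, y0, x0, cell=8, bins=9):
    """Unsigned gradient-orientation histogram for one cell.

    Each pixel votes for its orientation bin with its gradient magnitude.
    Cell size and bin count are assumed values, not taken from the paper.
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(y0, min(y0 + cell, h)):
        for x in range(x0, min(x0 + cell, w)):
            # Central differences, clamped at the image border.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // (180.0 / bins)) % bins] += mag
    return hist


def hog_sequence(img, cell=8, bins=9):
    """Return one feature vector per column of cells.

    Treating cell columns as time steps is an assumption about how the
    features could be presented to a sequence model such as an LSTM.
    """
    h, w = len(img), len(img[0])
    seq = []
    for x0 in range(0, w, cell):
        vec = []
        for y0 in range(0, h, cell):
            vec.extend(hog_cell_histogram(img, y0, x0, cell, bins))
        seq.append(vec)
    return seq


# Toy 16x16 "ligature": dark left half, bright right half.
image = [[0.0] * 8 + [1.0] * 8 for _ in range(16)]
augmented = shear_image(image, 0.25)
features = hog_sequence(augmented)
```

In this sketch, each augmented image would yield its own feature sequence, so a training set could be enlarged by applying several shear values to every original sample before feature extraction.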