Text document recognition systems perform well on printed documents but fail to produce similar results for handwritten text. The significant challenges include varied writing styles, complex backgrounds, noise introduced by image acquisition, and deformities in the text such as strike-offs and underlines. Any deformity can be viewed as a change in structural information that causes intensity variations in the original text. Restoration of deformed images aims to recover a clean image while maintaining its structural information and preserving the semantic dependencies of local pixels. Current adversarial networks fail to preserve these structural and semantic dependencies because they consider individual pixel-to-pixel variations and encourage perceptually non-meaningful aspects of the images. We propose a Variable Cycle Generative Adversarial Network (VCGAN) that incorporates the perceptual quality of the images into the learning objective through a variable content loss that preserves these dependencies. We also propose a Top-k Variable Loss (TV-k) to compute image similarity while accounting for intensity variations that do not interfere with the semantic structure of the image. The results show that VCGAN removes most of the deformities, achieving an elevated F1 score of 97.40%. We further evaluated the images generated by VCGAN with a handwritten text recognition system: VCGAN outperforms current state-of-the-art algorithms with a character error rate of 7.64% and a word accuracy of 81.53%.
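The abstract does not spell out the exact formulation of the TV-k loss. As a rough illustration only, the sketch below shows one plausible reading in PyTorch: only the k largest per-pixel deviations between the generated and clean images contribute to the loss, so small intensity variations that do not disturb the text structure are ignored. The function name, the k_fraction hyperparameter, and the weighting in the usage comment are assumptions for illustration, not the paper's actual method.

import torch

def top_k_variable_loss(generated: torch.Tensor, clean: torch.Tensor, k_fraction: float = 0.1) -> torch.Tensor:
    """Hypothetical sketch of a top-k style content loss.

    Averages only the k largest per-pixel absolute differences per image,
    ignoring smaller, perceptually insignificant intensity variations.
    """
    # Per-image absolute pixel differences, flattened to (batch, num_pixels)
    diff = (generated - clean).abs().flatten(start_dim=1)
    k = max(1, int(k_fraction * diff.shape[1]))
    # Keep only the k largest deviations for each image
    top_k, _ = diff.topk(k, dim=1)
    return top_k.mean()

# Possible usage alongside the usual adversarial and cycle-consistency terms
# (weights lambda_cycle and lambda_tvk are placeholders):
# loss = adv_loss + lambda_cycle * cycle_loss + lambda_tvk * top_k_variable_loss(fake, real)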