This work proposes a new method to enhance and binarize document images affected by several kinds of degradation. The method is based on the idea that the absolute difference between a document image and its background effectively emphasizes the text and attenuates degraded regions. The background generation is inspired by the human visual system and by how the perception of objects changes with distance: Snellen's visual acuity notation is used to define how far an image must be from an observer so that the details of the characters are no longer perceived and only the background remains. A scheme that combines the k-means clustering algorithm and Otsu's thresholding method then performs the binarization. The proposed method was tested on two datasets of document images (DIBCO 2011 and a dataset of real historical documents), with very satisfactory results.
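The core idea, subtracting an estimated background from the image and thresholding the absolute difference, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a grayscale image stored as a list of rows of integers in 0..255, and it substitutes a plain median filter for the Snellen-based background estimation, whose details the abstract does not give. The k-means step is omitted because the abstract does not say how it is combined with Otsu's method. The function names and the `radius` parameter are hypothetical.

```python
from statistics import median


def estimate_background(img, radius=2):
    """Stand-in background estimator (assumption): a median filter.

    The paper derives the background from Snellen's visual acuity
    notation; this filter only mimics "details no longer perceived".
    """
    h, w = len(img), len(img[0])
    bg = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            bg[y][x] = int(median(window))
    return bg


def absolute_difference(img, bg):
    """Pixel-wise |image - background|, which emphasizes the text."""
    return [[abs(p - b) for p, b in zip(r, br)] for r, br in zip(img, bg)]


def otsu_threshold(img):
    """Otsu's method: pick the threshold maximizing between-class variance."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(v * c for v, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]          # pixels with value <= t
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0        # pixels with value > t
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(img, radius=2):
    """Return a mask with 1 for text pixels and 0 for background."""
    diff = absolute_difference(img, estimate_background(img, radius))
    t = otsu_threshold(diff)
    return [[1 if v > t else 0 for v in row] for row in diff]
```

Calling `binarize(img)` on a light page with thin dark strokes marks the strokes as 1, provided the strokes are narrower than the filter window. A larger `radius` plays roughly the role of viewing the page from farther away.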