2009 10th International Conference on Document Analysis and Recognition 2009
DOI: 10.1109/icdar.2009.88

Manuscript Bleed-through Removal via Hysteresis Thresholding

Abstract: Many types of degradation can render ancient manuscripts very hard to read. In bleed-through, the text from the reverse, or verso, side of a page seeps through into the front, or recto. In this paper, we propose hysteresis thresholding to greatly reduce bleed-through. Thresholding alone cannot properly separate ink and bleed-through because the ranges of intensities for the two classes overlap. Hysteresis thresholding overcomes this limitation via the two steps of thresholding and ink regrowth. In order to pro…
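
The two steps named in the abstract, thresholding and ink regrowth, can be illustrated with a generic hysteresis-thresholding sketch. This is not the authors' implementation: the abstract is truncated, so the function name `hysteresis_threshold`, the parameters `strict` and `loose`, the use of 8-connectivity, and the convention that ink is darker (lower intensity) than paper are all assumptions made for illustration.

```python
from collections import deque


def hysteresis_threshold(image, strict, loose):
    """Return a binary mask (1 = ink) for a grayscale image given as a list of rows.

    Assumed convention: lower intensity means darker. Pixels at or below
    `strict` are treated as certain ink (seeds). The mask then regrows from
    the seeds into 8-connected neighbours whose intensity is at or below
    `loose`. Both thresholds are hypothetical parameters.
    """
    if strict > loose:
        raise ValueError("strict must not exceed loose")
    rows = len(image)
    cols = len(image[0]) if rows else 0
    mask = [[0] * cols for _ in range(rows)]
    queue = deque()

    # Step 1: strict thresholding keeps only pixels that are confidently ink.
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= strict:
                mask[r][c] = 1
                queue.append((r, c))

    # Step 2: regrowth. Breadth-first expansion from the seeds into
    # neighbouring pixels that pass the looser threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not mask[nr][nc] and image[nr][nc] <= loose):
                    mask[nr][nc] = 1
                    queue.append((nr, nc))
    return mask


# Toy image: a dark stroke (40) with a faded continuation (90), plus an
# isolated mid-intensity blob (100) standing in for bleed-through.
image = [
    [200, 200, 200, 200, 200],
    [200,  40,  90, 200, 100],
    [200, 200, 200, 200, 100],
]
mask = hysteresis_threshold(image, strict=50, loose=110)
for row in mask:
    print(row)
```

In this toy image the pixel of intensity 90 is kept because it touches a seed, while the pixels of intensity 100 are dropped because no seed is connected to them. A single global threshold of 110 would have kept both, which is the overlap problem the abstract describes.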

Cited by 27 publications (14 citation statements)
References 13 publications
“…Blind methods are mostly based on the assumption that there are three distinct intensity groups in the degraded images: the darkest region corresponding to foreground, the brightest to background, and bleed-through somewhere in between. There are many approaches available for segmentation based on intensity, for example, hysteresis thresholding [5], iterative K-means clustering and principal component analysis (PCA) [3], independent component analysis (ICA), either on the colour channels [15], or different colour space images [14]. However, the main issue with using intensity information only is that this will not be sufficient in severe cases where the bleed-through is equivalent in intensity to the foreground text.…”
Section: Bleed-through Removal (mentioning)
confidence: 99%
“…Secondly, for all document restoration techniques, problems arise when trying to analyse results quantitatively, as there is no actual ground truth available. This problem may be overcome either by creating synthetic degraded images with known ground truth, [5], [16], or by creating synthetic ground truth data for given real degraded images, [2]. Alternatively, performance may be evaluated without any ground truth by quantifying how the restoration affects a secondary step, such as the performance of an Optical Character Recognition (OCR) system on the document image, [17], [16].…”
Section: Introduction (mentioning)
confidence: 99%
“…Information that is unimportant to OGR, like the color of the vertices (edges) and the background, is removed from the image. This can be achieved with any binarization algorithm like global, adaptive or hysteresis thresholding [6,7,9,13,14]. The extent of information that is filtered depends both on the drawing of the graph and the tasks of the subsequent phases of OGR.…”
Section: Preprocessing (mentioning)
confidence: 99%
“…In blind methods only one side of the document is used, whereas non-blind methods exploit accurately pre-registered recto and verso images of the document. Most of the earlier blind methods involve an intensity based thresholding step [2]. However, thresholding is not suitable when the aim is to preserve the original appearance of the document.…”
Section: Introduction (mentioning)
confidence: 99%