Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
DOI: 10.1109/iwfhr.2002.1030917

Separating text and background in degraded document images - a comparison of global thresholding techniques for multi-stage thresholding

Abstract: Before any processing of the textual content of a document image can be performed the text must be separated from the background of the image. Several thresholding algorithms have previously been proposed and are widely used in document processing. None have been shown effective at thresholding difficult documents where the background and foreground are non-uniform. In this paper we investigate the use of three global thresholding algorithms (Otsu's, Kapur's entropy and Solihin's quadratic integral ratio (QIR)…
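To make the idea of a global threshold concrete, here is a minimal sketch of the textbook form of Otsu's method, the first of the three algorithms named in the abstract. It is not the paper's implementation. The function names `otsu_threshold` and `binarize` are hypothetical, and the sketch assumes 8-bit greyscale pixels given as a flat list, with dark text on a lighter background.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    Assumes integer pixel values in the range [0, levels).
    """
    # Build the grey-level histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0        # number of pixels in the lower class
    sum0 = 0      # sum of grey levels in the lower class
    best_t, best_var = 0, -1.0

    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = t
    return best_t


def binarize(pixels, t):
    """Label each pixel True (text) if it is at or below the threshold.

    Assumption: text is darker than the background.
    """
    return [p <= t for p in pixels]


if __name__ == "__main__":
    # Toy image: 50 dark "ink" pixels and 50 light "paper" pixels.
    image = [10] * 50 + [200] * 50
    t = otsu_threshold(image)
    print(t, sum(binarize(image, t)))
```

Because a single threshold is applied to the whole image, a sketch like this has no way to adapt to a background or foreground that varies across the page, which is the situation the abstract describes as difficult.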

Cited by 63 publications (24 citation statements)
References 8 publications
“…One direction is based on binarization techniques such as the multi-stage binarization method proposed by Bar-Yosef et al to restore and recognize ancient Hebrew calligraphy documents [15]. Leedham et al [16] have investigated three global thresholding algorithms and a multi-stage thresholding algorithm to separate text from background in degraded document images and concluded that the given global algorithms do not work well with difficult documents due to over-thresholding while the multi-stage algorithm can do a better job by incrementally remove the noise. Another direction is to separate different layers using classification techniques especially in addressing the bleedthrough problem.…”
Section: Previous Work (mentioning)
confidence: 99%
“…This function is currently realized using a thresholding technique which we have found to be sufficient for our purposes. For more sophisticated foreground/background separation, see [11]. Using is ink, the upper and lower word profiles can be calculated as follows:…”
Section: Word Profiles (mentioning)
confidence: 99%
“…The problem of restoring a document image suffering from bleedthrough degradation, which is a major task in the analysis of very old documents, has been studied from several points of view [1][2][3][4][5][6][7][8][9]. The methods can be categorized in two groups: methods which work on double-sided document images, and methods which work on single-sided document images.…”
Section: Introduction (mentioning)
confidence: 99%