2006
DOI: 10.21236/ada460371

Script-Independent Text Line Segmentation in Freestyle Handwritten Documents

Abstract: Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map, where each element represents the probability t…

Cited by 13 publications (34 citation statements)
References 15 publications
“…First, they are based on the assumption of uniformity among printed text, and require precise baseline alignment and word segmentation. Freestyle handwritten text lines are curvilinear, and in general, there are no well-defined baselines, even by linear or piecewise-linear approximation [4]. Second, it is difficult to extend these methods to a new language, because they employ a combination of hand-picked and trainable features and a variety of decision rules.…”
Section: Language Identification (mentioning)
confidence: 99%
“…We use 1512 document images of eight languages (Arabic, Chinese, English, Hindi, Japanese, Korean, Russian, and Thai) from the University of Maryland multilingual database [4] and IAM handwriting DB3.0 database [5] for evaluation on language identification (see Fig. 1).…”
Section: Dataset (mentioning)
confidence: 99%
“…Correctness/incorrectness of text-line segmentation directly affects the accuracy of word/character segmentation, which consequently changes the accuracy of word/character recognition [10]. Several techniques for text-line segmentation are reported in the literature [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18]. These techniques may be categorized into four groups [10,13] as follows: (i) projection profile based techniques, (ii) Hough transform based techniques, (iii) smearing techniques and (iv) methods based on thinning operations.…”
Section: Introduction (mentioning)
confidence: 99%