2015 13th International Conference on Document Analysis and Recognition (ICDAR) 2015
DOI: 10.1109/icdar.2015.7333877

Simplifying the reading of historical manuscripts

Abstract: Complex document layouts pose prominent challenges for document image understanding algorithms. These layouts impose irregularities on the location of text paragraphs, which consequently induce difficulties in reading the text. In this paper we present a robust framework for analyzing historical manuscripts with complex layouts. This framework aims to provide a convenient reading experience for historians through top-notch algorithms for text localization, classification and dewarping. We segment text into spat…

Citations: cited by 21 publications (10 citation statements)
References: 29 publications
“…Document layout segmentation using heuristic methods is classified mainly under three different categories: top-down, bottom-up and hybrid strategies. The bottom-up methods [2,31,34] used pixels as basic components, and performed operations like merging and grouping to form a larger homogeneous region. On the other hand, top-down methods [20,23,29] relied on splitting the whole document image iteratively into different regions, until a definite standard column or block was obtained.…”
Section: Heuristic Rule-based Document Layout Analysis
Type: mentioning
confidence: 99%
“…Saabni and El-Sana [40] proposed a method that developed seam lines among text lines using energy maps. Asi et al. [2] came up with a multi-scale texture-based algorithm for document images where Gabor filters were applied to locate different regions and a minimization energy function was applied to segment them. Despite successes of both top-down and bottom-up strategies, there are techniques [51] that have integrated both of them to segment regions in digital documents with complex layouts.…”
Section: Traditional Document Layout Segmentation
Type: mentioning
confidence: 99%
“…In [7], connected components are aggregated before vertical and horizontal white spaces are detected to produce a mask of areas of interest. Asi et al. [5] manage to simplify the layout of historical documents by locating, segmenting, and dewarping text lines with severe curvature. These strategies have also been applied to segment text-lines.…”
Section: Hybrid Strategies
Type: mentioning
confidence: 99%