Proceedings Third IEEE Workshop on Applications of Computer Vision. WACV'96
DOI: 10.1109/acv.1996.572074

Document layout structure extraction using bounding boxes of different entities

Abstract: This paper presents an efficient technique for document page layout structure extraction and classification by analyzing the spatial configuration of the bounding boxes of different entities on the given image. The algorithm segments an image into a list of homogeneous zones. The classification algorithm labels each zone as text, table, line-drawing, halftone, ruling, or noise. The text-lines and words are extracted within text zones and neighboring text-lines are merged to form text-blocks. The tabular structu…
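
As a rough illustration of the last step the abstract mentions (merging neighboring text-lines into text-blocks), the sketch below greedily merges line bounding boxes that are vertically close and horizontally overlapping. It is not the paper's algorithm: the function names, the `max_gap` and `min_overlap` thresholds, and the greedy strategy are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, with y growing downward.
Box = Tuple[int, int, int, int]


def horizontal_overlap(a: Box, b: Box) -> int:
    """Width in pixels of the horizontal overlap between two boxes (0 if none)."""
    return max(0, min(a[2], b[2]) - max(a[0], b[0]))


def union(a: Box, b: Box) -> Box:
    """Smallest box that encloses both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_text_lines(lines: List[Box], max_gap: int = 10,
                     min_overlap: float = 0.5) -> List[Box]:
    """Greedily merge text-line boxes into text-block boxes.

    A line joins an existing block when the vertical gap between them is at
    most `max_gap` pixels and their horizontal overlap covers at least
    `min_overlap` of the narrower of the two boxes. Both thresholds are
    illustrative assumptions, not values from the paper.
    """
    blocks: List[Box] = []
    for line in sorted(lines, key=lambda b: b[1]):  # process top to bottom
        for i, block in enumerate(blocks):
            gap = line[1] - block[3]
            narrower = min(line[2] - line[0], block[2] - block[0])
            if (gap <= max_gap and narrower > 0
                    and horizontal_overlap(line, block) >= min_overlap * narrower):
                blocks[i] = union(block, line)
                break
        else:
            blocks.append(line)  # no compatible block: start a new one
    return blocks


if __name__ == "__main__":
    sample = [(0, 0, 100, 10), (0, 14, 100, 24), (0, 60, 100, 70), (200, 0, 300, 10)]
    print(merge_text_lines(sample))
```

In this sample the first two lines are 4 pixels apart and fully overlap horizontally, so they form one block; the line at y=60 and the line in the second column each remain separate blocks.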

Cited by 23 publications (12 citation statements)
References 6 publications
“…In this section, we briefly present a rule-based algorithm that extracts document layout structure using the bounding boxes of different entities [15]. Then we report the performance of each module on the images from the UW-III Document Image Database.…”
Section: Results (mentioning)
confidence: 99%
“…Since then, many other works dealing with high-level form representation, studying the structural relation among fields and often pursuing a completely automatic form analysis have been presented. Some of them, are essentially rule-based, as [7], [6], [16] and [18]. Other use graphs to establish relations among the fields, as [2] and [20].…”
Section: Related Work (mentioning)
confidence: 99%
“…These methods typically separate the original document into many different regions. Then use many filters to classify each region [5,18,20] (only one level of homogeneous region is used). In addition to creating many filters, these methods only effective when the region is not too complicated.…”
Section: Introduction (mentioning)
confidence: 99%