The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003
DOI: 10.1109/acssc.2003.1291873

Conversion of PDF documents into HTML: a case study of document image analysis

Cited by 21 publications
(11 citation statements)
“…We again use the method of character projection to detect the columns, including number of columns, column position in a page. In contrast to [5], we do not assume only vertical columns.…”
Section: Detection Of Columns (mentioning)
confidence: 96%
“…In an earlier paper [4] we described a method of detecting page body areas by body text font expansion and header-footer elimination. Rahman et al [5] also discussed how to find columns at the page level using white space rectangles. Saitoh et al [6] presented a system of document image segmentation and text block ordering, including detection of text line direction as well as header and footer.…”
Section: Introduction (mentioning)
confidence: 98%
“…Section matches were then extracted and evaluated using sensitivity, specificity, and F1‐score in addition to being reviewed by a domain expert for accuracy. Additional related work includes PDF to HTML text detection approaches that maintain layout and font information (Jiang & Yang, ), table detection, extraction and annotation (Khusro, Latif, & Ullah, ) and analysis using white spaces (Rahman & Alam, ).…”
Section: Related Work (mentioning)
confidence: 99%
“…For instance, researches about recovering logical structures from electronic documents all target at customized restructuring tasks for very specific sets of documents [1,2,3,4].…”
Section: Introduction (mentioning)
confidence: 99%