1992
DOI: 10.1109/5.156473

Document analysis-from pixels to contents

Cited by 65 publications (25 citation statements)
References 6 publications
“…At the research level, this has been pursued in domains such as technical papers, business letters and chemical structure diagrams (Tsujimoto & Asada, 1992; Schürmann et al, 1992; Nagy, Seth, et al, 1985). Some commercial OCR systems provide limited inverse formatting, producing codes for elementary structures such as paragraphs, columns, and tables (Bokser, 1992).…”
Section: Text Documents (mentioning)
confidence: 99%
“…The classifier could store a single prototype per character. Schurmann, Bartneck, et al (1992) applies normalizing transformations to reduce certain well-defined variations as far as possible. The inevitably remaining variations are left for learning by statistical adaptation of the classifier.…”
Section: Character Recognition Feature Extraction (mentioning)
confidence: 99%
“…2 (a), the 'head' part is defined as the region from the leftmost position to a node point (denoted by 'N') where the value of horizontal histogram is larger than W β . Here, W is the average stroke thickness in the entire image calculated using a simple mathematical method proposed in [5], and β is empirically determined as 1.25, considering some possible noise or distortion on a stroke. Similarly, the 'tail' part is defined as the region from the rightmost position to the node point (denoted by 'M').…”
Section: Modified Distance Measures (mentioning)
confidence: 99%
“…The PLA is used as a pre-processing tool that decomposes the mixed-type documents into their main regions. The goal of the PLA is to discover formatting of the text and, from that, to derive meaning associated with the positional and functional blocks in which the text is located (Chauvet et al, 1992; Fujisawa et al, 1992; Schurmann et al, 1992; Witten et al 1994). A PLA method consists of two main steps.…”
Section: Document Multithresholding (mentioning)
confidence: 99%