2011 International Conference on Document Analysis and Recognition
DOI: 10.1109/icdar.2011.280

Connected Component Level Discrimination of Handwritten and Machine-Printed Text Using Eigenfaces

Abstract: We employ Eigenfaces to discriminate between handwritten and machine-printed text at the connected component (CC) level. Normalized images of machine-print CCs are treated as points in a high-dimensional space. PCA yields a reduced-dimensional character space. Representative machine-print CCs are projected into character space, and a local distance threshold for each representative is automatically determined. CCs are classified as machine print if they are within the local distance threshold of their closest ma…
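
The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_character_space`, `project`, `classify`) are hypothetical, PCA is computed with an SVD in NumPy, and distances are assumed to be Euclidean. The abstract does not say how the local thresholds are determined, so they are simply passed in as an array here.

```python
import numpy as np


def fit_character_space(train_images, n_components):
    """PCA over flattened, size-normalized machine-print CC images."""
    X = np.asarray(train_images, dtype=float).reshape(len(train_images), -1)
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(images, mean, basis):
    """Map images to coordinates in the reduced character space."""
    X = np.asarray(images, dtype=float).reshape(len(images), -1)
    return (X - mean) @ basis.T


def classify(cc_images, mean, basis, rep_coords, rep_thresholds):
    """Return True where a CC falls within the threshold of its nearest representative.

    rep_thresholds is an assumed input: one distance threshold per representative.
    """
    coords = project(cc_images, mean, basis)
    rep_thresholds = np.asarray(rep_thresholds, dtype=float)
    # Pairwise Euclidean distances, shape (n_ccs, n_representatives).
    dists = np.linalg.norm(coords[:, None, :] - rep_coords[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    nearest_dist = dists[np.arange(len(coords)), nearest]
    return nearest_dist <= rep_thresholds[nearest]


# Toy usage with 2x2 "images"; real CCs would be normalized to a larger fixed size.
train = np.array([
    [[0, 0], [0, 0]],
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
], dtype=float)
mean, basis = fit_character_space(train, n_components=2)
reps = project(train, mean, basis)       # here every training CC is a representative
thresholds = np.full(len(reps), 0.5)     # placeholder values, not from the paper
queries = np.array([
    [[0.9, 0], [0, 0]],   # close to the second representative
    [[5, 5], [0, 0]],     # far from every representative
], dtype=float)
is_machine_print = classify(queries, mean, basis, reps, thresholds)
```

In this sketch a CC that fails the threshold test is simply not labeled machine print; how the paper treats such CCs beyond the truncated abstract is not shown here.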

Cited by 12 publications
(7 citation statements)
References 14 publications
“…The variations of machine-printed samples are significantly lower than those of the handwritten class. As a result, the feature space extracted from the machine-printed samples are more concentrated, while the same features of the handwritten text samples are mapped to a significantly wider range [2]. This fact has been employed by many of the previous approaches to assign classification boundaries between the two classes, which provided acceptable results over particular types of documents.…”
Section: Introduction (mentioning)
confidence: 82%
“…An Eigenface-based approach has been proposed for the HMC task in [2]. As in [15], first, principal component analysis (PCA) is used to create a set of normalised characters in different font styles.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The classification of single characters is also possible. An interesting approach is proposed by Pinson et al [8], i.e., the use of "eigenfaces" adapted to text, which lead to a mean recall of 91.5%. Koyama et al [9] presented a method based on frequency domain analysis can reach a precision of 97% on different kinds of characters.…”
Section: Related Work (mentioning)
confidence: 99%
“…Handwriting may need to be extracted from forms and processed independently whereas machine-print-connected components (CC) are passed to an optical-character-recognition engine [28]. Registration forms are special documents that can be viewed as consisting of layers (letterhead, content, signatures, handwritings, table line, stamper, noise, etc.). Document analysis segments a registration-form document into layers with different physical and semantic properties.…”
Section: Introduction (mentioning)
confidence: 99%