2006
DOI: 10.1007/11669487_37

Automatic Keyword Extraction from Historical Document Images

Abstract: This paper presents an automatic keyword extraction method for historical document images. The proposed method is language independent because it is purely appearance based: it requires neither lexical information nor any statistical language model. Moreover, since it does not need word segmentation, it can be applied to Eastern languages, which do not put clear spacing between words. The first half of the paper describes the algorithm to retrieve document image regions which have …

Cited by 6 publications (3 citation statements)
References 10 publications

“…In (Rothfeder et al, 2003) word images are matched based on the corresponding interest points. The other studies on word spotting and retrieval include (Terasawa et al, 2006; Sankar and Jawahar, 2006; Llados et al, 2007).…”
Section: Related Work (mentioning)
confidence: 98%
“…Fink and Plotz in (Fink & Plotz, 2005) have tested appearance-based features for writer independent handwritten text recognition and compared it with heuristic features. Terasawa et al in (Terasawa et al, 2006) have developed principal component analysis-based descriptors and gradient distribution features for word spotting in historical handwritten documents. The only limitation of the approach is its application to only well segmented threshold documents and very regular handwritten texts.…”
Section: Word Spotting (mentioning)
confidence: 99%
“…An occurrence of this word is found when a sub-image of the data set matches the query image. The features that are extracted from the word image can greatly differ from one matching method to another [29,30].…”
Section: Document Image Analysis (mentioning)
confidence: 99%