2006
DOI: 10.1007/11669487_23

Script Identification from Indian Documents

Abstract: Automatic identification of a script in a given document image facilitates many important applications such as automatic archiving of multilingual documents, searching online archives of document images and for the selection of script specific OCR in a multilingual environment. In this paper, we present a scheme to identify different Indian scripts from a document image. This scheme employs hierarchical classification which uses features consistent with human perception. Such features are extracted f…

Cited by 60 publications (20 citation statements)
References: 11 publications
“…Dhanya et al [14], has achieved better performance in the latter method rather than the Pati et al [24]. But this would fail for some English characters which are descender dominant.…”
Section: Related Work (mentioning)
confidence: 96%
“…Dhanya et al [9], has performed script classification using Gabor filters for bilingual document images in frequency domain. In the work of Joshi et al [14], different Indian scripts are identified with a Log Gabor filter. Regarding token based approach Judith Hochberg et al [8], [9] exemplars are grouped into clusters based on a similarity measure.…”
Section: Related Work (mentioning)
confidence: 99%
“…The local approaches (Pal and Chaudhury [2], Pal et al [3]) analyze a list of connected components (Line, word, char) in the document images, to identify the script (or class of script). In contrast, global approaches (Joshi [4]) employ an analysis of regions (block of text) comprising at least two lines (or words) without finer segmentation. In general, global approaches work well based on texture measurement, but this relies heavily on a uniform block of text (Busch et al [5]), and extensive preprocessing (to make the text block uniform) is required to measure the texture.…”
Section: B Script Recognition (mentioning)
confidence: 99%
“…In Europe and in countries like India, where there are multiple official languages with independent scripts, the automation of language recognition is of utmost importance for proper management of digitized documents. Automatic script identification is useful in sorting document images, selecting appropriate script-specific OCRs, and search online archives of document image for those containing a particular script Joshi et al (2006). With the high amount of digitization happening all over which is not just restricted to English documents, a language identification system capable of identifying both Asian and European languages is important.…”
Section: Introduction (mentioning)
confidence: 99%