2014 14th International Conference on Frontiers in Handwriting Recognition 2014
DOI: 10.1109/icfhr.2014.69

Automatic Handwritten Indian Scripts Identification

Abstract: Since OCR engines are usually script-dependent, automatic text recognition in multi-script documents requires a pre-processor module that identifies the scripts. Motivated by this, we present a word-level handwritten Indian script identification technique. Words are first segmented by morphological dilation followed by connected component labelling. We then employ the Radon transform, discrete wavelet transform, statistical filters and discrete cosine transform to extract…
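
The segmentation step named in the abstract (morphological dilation followed by connected component labelling) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the horizontal structuring element, the `radius` parameter, the 8-connectivity, and the function names `dilate_horizontal`, `label_components` and `segment_words` are all assumptions made for illustration.

```python
from collections import deque


def dilate_horizontal(img, radius):
    """Binary dilation with an assumed 1 x (2*radius + 1) horizontal element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lo, hi = max(0, x - radius), min(w, x + radius + 1)
            # A pixel is set if any pixel in its horizontal window is set.
            if any(img[y][lo:hi]):
                out[y][x] = 1
    return out


def label_components(img):
    """Find 8-connected components by breadth-first search.

    Returns one inclusive bounding box (top, left, bottom, right) per component.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def segment_words(img, radius=1):
    """Dilate so nearby strokes merge, then return one box per merged blob."""
    boxes = label_components(dilate_horizontal(img, radius))
    return sorted(boxes, key=lambda b: (b[0], b[1]))
```

On a toy binary line image with four separate strokes, a small `radius` merges strokes that sit close together into one candidate word while leaving the wider gap between words intact. The returned boxes describe the dilated blobs, so they are slightly wider than the original strokes; a real pipeline would tune the structuring element to the handwriting at hand.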

Citations: cited by 37 publications (7 citation statements)
References: 18 publications
“…Major handwritten datasets include IAM [286], NIST [271], MNIST [277], CEDAR [261], RIMES [313], [314], UNIPEN, CENPARMI-Arabic [340] PE92 [410], etc. The datasets developed are mostly in languages like English like IAM, CEDAR, NIST, MNIST, IAM-OnDB, etc., Arabic AHDB, ARABASE, CENPARMI-A, LMCA, KHATT, CENPARMI-F etc., Chinese HCL2000, CASIA, SCUT-COUCH, etc., Indian languages like Bangla: BN-HTRd, Numerals DB, Devanagari DB, Multiscript Indian DB (Bangla, Devanagari, Tamil, Telugu) [217], Multiscript DB 11 scripts (Roman, Devanagari, Urdu, Kannada, Oriya, Gujarati, Bangla, Gurumukhi, Tamil, Telugu, Malayalam) [404] The traditional tasks for DAR, supported by most datasets, are pre-processing, segmentation and recognition. Other tasks like DLA, word spotting, and forensic document analysis (WI and verification) have very few datasets concerning them.…”
Section: Handwritten Datasets (mentioning)
confidence: 99%
“…So, designing a handwritten numeral recognizer which can work in a multilingual environment will be of huge significance in the Indian context. A lot of research work (Chatterjee et al, 2019; Hangarge et al, 2013; Obaidullah et al, 2019; Pardeshi et al, 2014; Sahare et al, 2019; Singh et al, 2019; Singh, Sarkar, Bhateja, & Nasipuri, 2018; Singh, Sarkar, Nasipuri, & Doermann, 2015) has already been done for the recognition of characters (considering only text), but only a few work (Obaidullah et al, 2015; Obaidullah et al, 2016; Rakshit et al, 2019; Roy et al, 2004) has been extended to numerals. Motivated by the research gap, our approach aims to recognize handwritten numerals in four major scripts used in the Indian sub-continent.…”
Section: Existing Work (mentioning)
confidence: 99%
“…Circularity is the most important feature among the mathematical feature and the fractal-based feature is the predominant feature among the structure based feature in [9]. In 2014, Rajmohan Pardeshi uses the feature based on spatial information and multi-resolution in [10], DCT coefficients of first 10 are preferred and added together to generate total 20 of features for script identification. Sub-band coding of DWT, projection of the RT, 46 dimensions of feature vector yields from SFs are used to compute standard deviation and entropy.…”
Section: Indian Handwritten Script Identification System (mentioning)
confidence: 99%
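
The last citation statement mentions three ingredients: the leading DCT coefficients, and the standard deviation and entropy of transform outputs. The sketch below shows one plausible way to compute each in plain Python. It is illustrative only: the unnormalised 1-D DCT-II, the histogram-based entropy with a `bins` parameter, and the function names are assumptions, and the exact feature layout of the cited paper is not recoverable from the statement.

```python
import math


def dct2(signal):
    """Unnormalised 1-D DCT-II of a sequence of numbers."""
    n = len(signal)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n)
            for i, v in enumerate(signal))
        for k in range(n)
    ]


def first_coefficients(signal, count):
    """Keep only the first `count` DCT coefficients (low-frequency content)."""
    return dct2(signal)[:count]


def std_dev(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def shannon_entropy(values, bins=8):
    """Shannon entropy (in bits) of a histogram of the values.

    The number of bins is an assumed parameter, not taken from the paper.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # all values identical: no uncertainty
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts if c)
```

A feature vector in this spirit would concatenate `first_coefficients(...)` of a word-image profile with `std_dev` and `shannon_entropy` values computed per sub-band or projection; how the cited work arranges these values is not specified here.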