Developing a handwritten document image dataset is one of the most tedious and time-consuming tasks in experimental work on optical character recognition (OCR). Special attention needs to be given to feasibility, realism, clarity and similar factors while collecting real-life data from different writers. A few efforts to develop a handwritten numeral image dataset (NIdb) can be found in the literature, but each was restricted to a single script, namely the local script of the researchers who prepared the database. In this paper, an approach is proposed for developing a word-level handwritten NIdb of four popular Indic scripts, namely Bangla, Devanagari, Roman and Urdu. A benchmark result is established for the handwritten numeral script identification (HNSI) problem by applying a novel image transform fusion (ITF) based technique. The proposed dataset will be freely available to researchers for non-commercial use.
S.M. Obaidullah et al.