2013 12th International Conference on Document Analysis and Recognition, 2013
DOI: 10.1109/icdar.2013.139

Devanagari Text Recognition: A Transcription Based Formulation

Abstract: For most Indian scripts, Optical Character Recognition (OCR) problems are often formulated as an isolated character (symbol) classification task followed by a post-classification stage (containing modules such as Unicode generation and error correction) that generates the textual representation. Such approaches are prone to failure due to (i) difficulties in designing a reliable word-to-symbol segmentation module that works robustly in the presence of degraded (cut/fused) images and (ii) converting…

Cited by 17 publications (15 citation statements). References 21 publications.

“…The system should be capable of learning these rules. While [7] demonstrated the system which recognizes Hindi (a language with relatively fewer rearranging rules), the present work shows the result on multiple Indic scripts, having much complicated rearranging rules. To accommodate such rules it is important for the classifier to analyze the feature segment based on its forward and backward information.…”
Section: OCR: A Transcription Framework (citation type: mentioning)
confidence: 80%
“…More over, these conversions are often brittle and can fail with noisy symbol labels. In our previous work [7], we have proposed a solution to this problem by considering it as a sequence to sequence transcription, where we convert the sequence of word features into the corresponding text sequence. Such an approach does not mandate one to identify the UNICODE rearranging rules in advance.…”
Section: OCR: A Transcription Framework (citation type: mentioning)
confidence: 99%
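
To illustrate the kind of Unicode rearranging rule this statement refers to, here is a minimal sketch. In Devanagari, the vowel sign I (U+093F) is drawn to the left of its consonant but stored after it in Unicode, so a symbol-level recognizer that reads left to right has to reorder its output. The function `visual_to_logical` and its single hard-coded rule are hypothetical and handle only this one simple case (no conjuncts); they are not the method of the cited paper, which aims to avoid hand-written rules of this kind.

```python
import unicodedata

KA = "\u0915"      # DEVANAGARI LETTER KA
SIGN_I = "\u093F"  # DEVANAGARI VOWEL SIGN I (rendered to the left of its consonant)


def visual_to_logical(symbols):
    """Toy reordering rule (an illustrative assumption, not the paper's method):
    a vowel sign I seen before a symbol in visual order is moved after that
    symbol, which is the order Unicode expects for a simple consonant."""
    out = []
    i = 0
    while i < len(symbols):
        if symbols[i] == SIGN_I and i + 1 < len(symbols):
            out.append(symbols[i + 1])  # consonant first in logical order
            out.append(SIGN_I)          # then the vowel sign
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return "".join(out)


# Symbols as a left-to-right symbol classifier would emit them for the syllable "ki".
visual = [SIGN_I, KA]
logical = visual_to_logical(visual)
names = [unicodedata.name(ch) for ch in logical]
```

A transcription-based recognizer, as described in the statement above, would instead learn the mapping from the feature sequence to the Unicode sequence directly, so rules like this one need not be identified in advance.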
“…Some work has also been done in the last few years on online handwritten recognition of Telugu script using HMM [23] and online handwritten Tamil word recognition [24] have used segmentation based approaches. Naveen et al presented a direct implementation of single layer LSTM network for the recognition of Devanagiri scripts [25], [26] and further experimented on more Indic scripts [27]. In this paper we present a Deep BLSTM based RNN architecture that is script independent, segmentation free and does not have any unicode re-ordering issues.…”
Section: Related Work (citation type: mentioning)
confidence: 98%