2015 Fifteenth International Conference on Advances in ICT for Emerging Regions (ICTer) 2015
DOI: 10.1109/icter.2015.7377678

Developing a commercial grade Tamil OCR for recognizing font and size independent text

Cited by 10 publications (7 citation statements)
References 7 publications
“…Deals with three feature extraction techniques in order to grasp features from various Tamil characters possessing variations in style and shape Tamil text recognition by using KNN classifier [9] Overall 91%…”
Section: Approach
mentioning
confidence: 99%
“…The value for K in K-Means Clustering was taken as 25 since it was acceptable choice in terms of various factors. The overall recognition accuracy is 92.77% and 89.66% on the training set and test set respectively. Elakkiya et al. [9] proposed an approach for Tamil text recognition by using KNN classifier. This involves a template creation stage where images of every letters are gathered and split into connected component images.…”
mentioning
confidence: 99%
“…In such work, a comparative study among the different classifiers has been conducted by concluding that the Hoeffding tree was the most performing one by reaching 73% accuracy. Another approach to support character recognition was based on the use of the Tesseract OCR open source engine to train a Tamil OCR model [47]. In particular, a segmentation approach was adopted by using a box file system.…”
Section: Optical Character Recognition
mentioning
confidence: 99%
“…It's well-known that Indic languages have many complexities and variations of characters which makes OCR systems hard to develop. But in the past few years, multiple studies have been conducted integrating Tesseract OCR engine for character recognition using different low resource languages such as Tamil [10], Hindi [11], Bengali [12] and Urdu [13].…”
Section: International Journal on Advances in ICT for Emerging Regions
mentioning
confidence: 99%