Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track 2023
DOI: 10.18653/v1/2023.emnlp-industry.44

Gold Standard Bangla OCR Dataset: An In-Depth Look at Data Preprocessing and Annotation Processes

Hasmot Ali,
AKM Shahariar Azad Rabby,
Md Majedul Islam
et al.

Abstract: This research paper focuses on developing an improved Bangla Optical Character Recognition (OCR) system, addressing the challenges posed by the complexity of Bangla text structure, diverse handwriting styles, and the scarcity of comprehensive datasets. Leveraging recent advancements in Deep Learning and OCR techniques, we anticipate a significant enhancement in the performance of Bangla OCR by utilizing a large and diverse collection of labeled Bangla text image datasets. This study introduces the most extensi…
