2021
DOI: 10.16995/dscn.8094
|View full text |Cite
|
Sign up to set email alerts
|

Advances and Limitations in Open Source Arabic-Script OCR: A Case Study

Abstract: This work presents an accuracy study of the open source OCR engine, Kraken, on the leading Arabic scholarly journal, al-Abhath. In contrast with other commercially available OCR engines, Kraken is shown to be capable of producing highly accurate Arabic-script OCR. The study also assesses the relative accuracy of typeface-specific and generalized models on the al-Abhath data and provides a microanalysis of the “error instances” and the contextual features that may have contributed to OCR misrecognition. Buildin… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1

Citation Types

0
1
0

Year Published

2023
2023
2023
2023

Publication Types

Select...
2

Relationship

0
2

Authors

Journals

citations
Cited by 2 publications
(1 citation statement)
references
References 3 publications
0
1
0
Order By: Relevance
“…Kiessling et al [104] discuss various open-source tools for Arabic OCR systems containing Tesseract (https://github.com/tesseract-ocr/tesseract, accessed on 24 February 2023), OCRad (https://www.gnu.org/software/ocrad/, accessed on 24 February 2023), and GOCR (https://jocr.sourceforge.net/, accessed on 24 February 2023). These tools are evaluated using an Arabic dataset, and Tesseract gives better recognition results.…”
Section: Discussionmentioning
confidence: 99%
“…Kiessling et al [104] discuss various open-source tools for Arabic OCR systems containing Tesseract (https://github.com/tesseract-ocr/tesseract, accessed on 24 February 2023), OCRad (https://www.gnu.org/software/ocrad/, accessed on 24 February 2023), and GOCR (https://jocr.sourceforge.net/, accessed on 24 February 2023). These tools are evaluated using an Arabic dataset, and Tesseract gives better recognition results.…”
Section: Discussionmentioning
confidence: 99%