2022
DOI: 10.21203/rs.3.rs-2273629/v1
Preprint

Transcription of Ottoman Machine-Print Documents

Abstract: With the ever-increasing speed of digitization, a large collection of Ottoman documents has become accessible to researchers and the general public. However, most users interested in these documents cannot read them unless they are transcribed into the modern Turkish script, which uses an extended version of the Latin alphabet. Manual transcription of such a massive number of documents is beyond the capacity of human experts. As a solution, we propose an automatic recognition system for…

Citations: cited by 1 publication (1 citation statement)
References: 41 publications (54 reference statements)
“…In testing, the system achieved a 3.68% CER and a 16.61% Word Error Rate (WER) on a 1.4K line image test set, employing the Word Beam Search (WBS) decoder along with a 260K-word lexicon. The study demonstrated that using a recognition lexicon to restrict transcription output improves accuracy, but this approach is not optimal for the agglutinative nature of the Turkish language [12].…”
Section: Literature Review (mentioning)
confidence: 99%
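
The CER and WER figures quoted in the citation statement are conventionally defined as an edit distance divided by the length of the reference transcription, counted in characters or in words. The following is a minimal sketch of that conventional definition, not the evaluation code of the cited study: it assumes corpus-level normalization (total edits over total reference length) and whitespace word tokenization, and the names `edit_distance` and `error_rate` are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (free on a match)
            ))
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="char"):
    """Corpus-level CER (unit="char") or WER (unit="word").

    Assumption: words are split on whitespace and characters include spaces.
    """
    if unit not in ("char", "word"):
        raise ValueError("unit must be 'char' or 'word'")
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    tokenize = list if unit == "char" else str.split
    edits = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens, hyp_tokens = tokenize(ref), tokenize(hyp)
        edits += edit_distance(ref_tokens, hyp_tokens)
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("reference text is empty")
    return edits / total


if __name__ == "__main__":
    # Toy strings for illustration only, not data from the paper.
    refs = ["bu bir kitap"]
    hyps = ["bu bir kitab"]
    print(f"CER: {error_rate(refs, hyps, 'char'):.4f}")
    print(f"WER: {error_rate(refs, hyps, 'word'):.4f}")
```

One wrong character makes an entire word count as an error, which is why a WER several times larger than the CER, as in the figures quoted above, is typical for line-level recognition.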