2022
DOI: 10.31590/ejosat.1039846

Online Turkish Handwriting Recognition Using Synthetic Data

Abstract: We present a recognition system for online Turkish handwriting, trained with synthetically generated data and transfer learning. Training deep networks requires large amounts of data, but a sufficiently large collection of Turkish handwriting samples is not available. Hence, we synthesize data for pretraining and then adapt the system to the target dataset by fine-tuning. We generate words from the isolated-character collection of a large English handwriting dataset. Then, we train the system first with syntheti…
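The word-synthesis step described in the abstract can be illustrated with a minimal sketch. It assumes that an online handwriting sample is a list of pen strokes, each a list of (x, y) points, and that a word is built by placing isolated-character samples side by side. The names `Glyph`, `synthesize_word`, and the `gap` parameter are hypothetical and do not come from the paper, whose actual procedure is not given in the truncated abstract.

```python
from typing import Dict, List, Tuple

# Assumed representation (not from the paper): a character sample is a list
# of pen strokes, and each stroke is a list of (x, y) points.
Point = Tuple[float, float]
Stroke = List[Point]
Glyph = List[Stroke]


def glyph_bounds(glyph: Glyph) -> Tuple[float, float]:
    """Return the leftmost and rightmost x coordinate of a glyph."""
    xs = [x for stroke in glyph for x, _ in stroke]
    return min(xs), max(xs)


def synthesize_word(word: str, glyphs: Dict[str, Glyph], gap: float = 0.2) -> Glyph:
    """Concatenate isolated-character glyphs left to right into one word.

    `glyphs` maps each character to one sample; `gap` is a hypothetical
    horizontal spacing between consecutive characters.
    """
    strokes: Glyph = []
    cursor = 0.0  # x position where the next character starts
    for ch in word:
        glyph = glyphs[ch]
        left, right = glyph_bounds(glyph)
        for stroke in glyph:
            # Shift the glyph so that its left edge sits at the cursor.
            strokes.append([(x - left + cursor, y) for x, y in stroke])
        cursor += (right - left) + gap
    return strokes
```

A synthetic corpus built this way would serve only for pretraining; per the abstract, the system is afterwards fine-tuned on the target dataset.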

Cited by 2 publications (3 citation statements)
References 48 publications
“…Taking an approach similar to the method proposed in this work, a CNN-BLSTM network, which is pre-trained with a synthetic dataset and fine-tuned with the ET train samples, achieved 12% CER and 44% WER on ET test set in [40]. Our results are approximately equal to those last results while we use a less complex method.…”
Section: Discussion | supporting
confidence: 59%
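The CER and WER figures quoted above are conventionally defined as the edit distance between the recognized text and the reference, divided by the reference length in characters or words. The sketch below shows that standard definition. It is not taken from either paper, which may normalize or tokenize differently, and it assumes a non-empty reference.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate; assumes `ref` is non-empty."""
    return edit_distance(ref, hyp) / len(ref)


def wer(ref: str, hyp: str) -> float:
    """Word error rate over whitespace-separated tokens; assumes `ref` is non-empty."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)
```

A single wrong character in a word counts as one character error but makes the whole word wrong, which is one reason WER is typically much higher than CER for the same recognizer.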
“…When the lexicon size is increased to 12,500, recognition accuracy is measured as 67.9%, using a bi-gram language model based on word stems and suffixes. In a recent study, a CNN-BLSTM network which is pre-trained with a synthetic dataset and fine-tuned with the Turkish dataset used in [24] achieved 88% character recognition accuracy in an open dictionary recognition task on that Turkish dataset [40].…”
Section: Related Work (İlişkili Çalışmalar) | mentioning
confidence: 99%
“…Initially, most research focused on recognizing letters of the Latin alphabet, but in recent years other alphabets have also attracted attention: Arabic, Russian, Kazakh, Chinese, Indian, and others [6][7][8][9][10][11]. For research on technologies for recognizing handwritten letters of the Latin alphabet, the EMNIST dataset is the de facto standard [12].…”
Section: Introduction | unclassified