2022
DOI: 10.3390/app12084083

Burapha-TH: A Multi-Purpose Character, Digit, and Syllable Handwriting Dataset

Abstract: In handwriting recognition research, a public image dataset is necessary to evaluate algorithm correctness and runtime performance. Unfortunately, in existing Thai language script image datasets, there is a lack of variety of standard handwriting types. This paper focuses on a new offline Thai handwriting image dataset named Burapha-TH. The dataset has 68 character classes, 10 digit classes, and 320 syllable classes. For constructing the dataset, 1072 Thai native speakers wrote on collection datasheets that we…

Cited by 4 publications
(2 citation statements)
References 17 publications
“…The HOGfoDRs+SVM method achieved 98.76% with 5-fold cross-validation. For the updated Thai handwritten dataset, Onuean et al [27] collected Thai handwritten characters, called Burapha-TH, that consisted of 10 digits, 68 characters, and 320 syllable classes. They also created a CNN model using a VGG architecture with a batch normalization layer containing 13 layers, called VGG-13, evaluated on the Burapha-TH dataset, and which achieved 92.29%, 95.00%, and 96.16% accuracy on the digit, character, and syllable classes, respectively.…”
Section: A Handwritten Character Recognition
Citation type: mentioning
confidence: 99%
“…Devanagari Optical Character Recognition is one such area that has seen many investigations. There are some publicly available benchmarks handwritten character databases available for scripts like Odia [3], Arabic [4] [5], Malayalam [6] [7], Meitei Mayek [8], Farsi [9], Telugu [10], Urdu [11], Tifinagh [12], Thai [13] and MNIST [14] datasets. However, no research has been carried out on character recognition of the Ranjana script.…”
Section: Introduction
Citation type: mentioning
confidence: 99%