2016
DOI: 10.3390/s16030346

Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research

Abstract: Document analysis tasks such as pattern recognition, word spotting, or segmentation require comprehensive databases for training and validation. Not only variations in writing style but also the list of words used is important when training samples should reflect the input of a specific area of application. However, generating training samples is expensive in terms of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is …

Citations: cited by 7 publications (5 citation statements)
References: 40 publications
“…Pattern recognition includes techniques such as speech recognition, which recognizes and extracts human voices from voice data and interprets them as commands in optical character recognition (OCR), where characters from image data are recognized and converted into text data and a full-text search system, which recognizes specific keywords and searches documents from a large amount of document information [22,23].…”
Section: Pattern Recognition (mentioning)
confidence: 99%
“…In [42], the authors proposed a system synthesizing Arabic handwritten words and text pages to generate comprehensive databases for training and validating OCR systems. In the database, vocabulary of the 50,000 most-common Arabic words are used for error correction.…”
Section: Dataset (mentioning)
confidence: 99%
“…en, backpropagation neural network, statistical pattern recognition, and support vector machine (SVM) methods [13] are used to recognize the characters or strokes. Finally, the individual characters or strokes are combined into characters according to specific rules.…”
Section: Manchu Language OCR (mentioning)
confidence: 99%