2018
DOI: 10.1016/j.ipm.2018.06.001

Statistical learning for OCR error correction

Cited by 34 publications (41 citation statements)
References 27 publications

“…Some prior approaches utilized frequencies of word and word ngrams to detect errors. A token is viewed as an error if its frequency or its ngram frequencies are less than a threshold [6]. Similarly, Khirbat [7] combined ngram frequencies with a presence of non alphanumeric text within a token to classify whether the token is an error or not.…”
Section: Mixed Error Detection (mentioning)
confidence: 99%
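
The frequency-threshold idea in the statement above can be sketched as follows. This is a minimal illustration, not the method of the cited works: the function names, the use of bigrams as the "ngram" context, the rule that a token is suspicious only when all of its neighbouring bigrams are rare, and the default thresholds are all assumptions made for the example.

```python
from collections import Counter


def build_counts(reference_tokens):
    """Count unigrams and adjacent bigrams in a clean reference corpus."""
    unigrams = Counter(reference_tokens)
    bigrams = Counter(zip(reference_tokens, reference_tokens[1:]))
    return unigrams, bigrams


def flag_suspects(tokens, unigrams, bigrams, min_unigram=1, min_bigram=1):
    """Return one boolean per token: True if the token looks like an OCR error.

    Assumed rule: flag a token when its unigram count is below min_unigram,
    or when every bigram that contains it is rarer than min_bigram.
    """
    flags = []
    for i, tok in enumerate(tokens):
        rare_word = unigrams[tok] < min_unigram
        context = []
        if i > 0:
            context.append(bigrams[(tokens[i - 1], tok)])
        if i + 1 < len(tokens):
            context.append(bigrams[(tok, tokens[i + 1])])
        # A token with no neighbours has no context evidence either way.
        rare_context = bool(context) and max(context) < min_bigram
        flags.append(rare_word or rare_context)
    return flags


reference = "the cat sat on the mat".split()
unigrams, bigrams = build_counts(reference)
ocr_output = "the cat sat on the rnat".split()
flags = flag_suspects(ocr_output, unigrams, bigrams)
```

In this toy run only the last token, "rnat", is flagged, because it never occurs in the reference corpus. In practice the thresholds would be tuned on held-out data.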
“…Most of the techniques of this type rely on noisy channel and language model [1,15,27]. The others explore several machine learning techniques to suggest correct candidates [2,10,16].…”
Section: OCR Post-processing Approaches (mentioning)
confidence: 99%
“…Other approaches [10,16] explored different sources to generate candidates and then ranked them using a regression model. Several features were extracted such as confusion probability, uni-gram frequency, context feature, term frequency in the OCR text, word confidence, and string similarity.…”
Section: OCR Post-processing Approaches (mentioning)
confidence: 99%
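
The candidate-ranking step described above can be sketched roughly as follows. This is an assumption-laden illustration, not the model from the cited papers: it uses only two of the listed features (string similarity and unigram frequency), and it replaces the trained regression model with a fixed linear score whose weights are hypothetical placeholders.

```python
from difflib import SequenceMatcher


def candidate_features(ocr_token, candidate, unigram_freq, total):
    """Return [string similarity, relative unigram frequency] for one candidate."""
    similarity = SequenceMatcher(None, ocr_token, candidate).ratio()
    rel_freq = unigram_freq.get(candidate, 0) / total
    return [similarity, rel_freq]


def rank_candidates(ocr_token, candidates, unigram_freq, weights=(1.0, 1.0)):
    """Sort candidates by a weighted sum of their features, best first.

    The weights stand in for coefficients that a regression model would
    learn from labelled corrections; the defaults here are arbitrary.
    """
    total = sum(unigram_freq.values()) or 1
    scored = []
    for cand in candidates:
        feats = candidate_features(ocr_token, cand, unigram_freq, total)
        score = sum(w * f for w, f in zip(weights, feats))
        scored.append((score, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored]


freq = {"mat": 5, "rat": 1, "moat": 1}
ranking = rank_candidates("rnat", ["mat", "rat", "moat"], freq)
```

With these toy counts, "mat" ranks first: "rat" is closer to "rnat" as a string, but the higher frequency of "mat" outweighs that. A fuller version would add the remaining features (confusion probability, context, OCR word confidence) and fit the weights on training data.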