Proceedings of the 9th IAPR International Workshop on Document Analysis Systems 2010
DOI: 10.1145/1815330.1815356

Analysis of whole-book recognition

Abstract: Whole-book recognition is a document image analysis strategy that operates on the complete set of a book's page images, attempting to improve accuracy by automatic unsupervised adaptation. Our algorithm expects to be given initial iconic and linguistic models, derived from (generally errorful) OCR results and (generally incomplete) dictionaries, and then, guided entirely by evidence internal to the test set, the algorithm corrects the models yielding improved accuracy. We have found that successful corrections ar…

Cited by 5 publications (10 citation statements)
References: 10 publications
“…iconic and linguistic models) may get different recognition results, among which disagreements can be measured. Formal analysis in [6] shows that disagreement's statistical properties can be leveraged to improve recognition rates when processing long isogenous book images.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, if we collect contextual predictions for a number of this character's occurrences in the passage and look for a "consensus" among them, this resulting hypothesis is more likely to be correct because contextual predictions on one character coming from different words are in a sense independent. If we adapt the iconic model on this character to achieve a smallest possible whole-passage disagreement, the top hypothesis of this character will then correspond to the consensus of its contextual predictions from the whole passage, and its accuracy is determined by that of this consensus too [6]. This is the principal reason why iconic model adaptation driven by whole-passage disagreement leads to better recognition results.…”
Section: Methods (mentioning)
confidence: 99%
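The "consensus" idea in the statement above can be pictured with a toy sketch. This is not the cited paper's algorithm: it assumes, purely for illustration, that consensus means a simple majority vote, and the function name `consensus` and the sample predictions are hypothetical.

```python
from collections import Counter


def consensus(predictions):
    """Return the most frequent label and its share of the votes.

    `predictions` stands in for the labels that the linguistic context
    suggests for each occurrence of one character shape in a passage.
    Majority voting is an assumption made for this sketch.
    """
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


# Hypothetical contextual predictions for one character shape,
# each coming from a different word in the passage.
contextual = ["e", "e", "c", "e", "o", "e"]
label, share = consensus(contextual)
print(label, round(share, 2))  # prints: e 0.67
```

Because each vote comes from a different word, individual contextual errors ("c", "o") tend to be outvoted as the passage grows longer, which is the intuition the statement appeals to.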
“…Pingping Xiu and I, believing that model adaptation operating on long passages might shed light on these questions, explored "whole-book" recognition [57], a strategy that operates on the complete set of a book's page images using automatic adaptation to improve accuracy.…”
Section: B Mutually Correcting Models (mentioning)
confidence: 99%