2008
DOI: 10.1007/978-3-540-76280-5_1

Introduction to Document Analysis and Recognition

Cited by 54 publications (39 citation statements)
References 24 publications
“…Unfortunately, publicly available data sets for document type classification (Marinai, 2008) do not resemble the type of administrative documents that we work with. Therefore we use a data set of real life dossiers acquired from a Dutch anonymous company that provides consumer loans.…”
Section: Dataset (mentioning)
confidence: 99%
“…Automatic document analysis is the field that deals with the different steps in the document processing pipeline, from the incoming documents to document specific treatment. A major part of this field focuses on digital image analysis (Marinai, 2008). Here we focus on the problems occurring at the start of the document processing pipeline and instead of doing image analysis, we concentrate on textual content analysis of OCR-ed versions of the scanned documents.…”
Section: Introduction (mentioning)
confidence: 99%
“…In successful DIA systems, the size of the training sets used to develop and tune the system is necessarily far smaller than the volume of data processed over the lifetime of the system. We should apply learning algorithms that use routine feedback from the operator to improve classification [30,31]. Furthermore, reduction of the cost of storage has led to the maintenance of complete records, even of systems as large as the whole web.…”
Section: Green Interaction (mentioning)
confidence: 99%
“…Research on Devanagari [1] character [2][3][4][5][6][7][8][9][10] and word [11][12][13] recognition is very difficult due to its challenging properties. This area of research is still open for further research due to the extent of variation among writing styles, speed, thickness of character and direction of different writers.…”
Section: Introduction (mentioning)
confidence: 99%