First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings.
DOI: 10.1109/dial.2004.1263243

Digital mountain: from granite archive to global access

Abstract: Large-scale, multi-terabyte digital libraries are becoming feasible due to decreasing costs of storage, CPU, and bandwidth. However, costs associated with preparing content for input into the library remain high due to the amount of human labor required. This paper describes the Digital Microfilm Pipeline, a sequence of image processing operations used to populate a large-scale digital library from a "mountain" of microfilm and reduce the human labor involved. Essential parts of the pipeline include algorithms …

Cited by 6 publications (8 citation statements)
References 21 publications
“…Much of the previous work in form processing assumes the availability of templates for form types of interest [4], [5], [6]. This assumption has been relaxed in later work [7], [8].…”
Section: B Form Processing
Citation type: mentioning
confidence: 99%
“…Continuous scanning is followed by automatic frame cropping as an efficient and fast procedure to generate images from microfilm [9]. Fourier-Mellin transform is used to correct rotation/shear, scale and translation errors [28].…”
Section: DIA Challenges In Historical DL Collections
Citation type: mentioning
confidence: 99%
“…Layout analysis and metadata extraction is a crucial step in creating an information base for historical DL's. Even as researchers are gaining ground on complete recognition of text content from historical documents (Subsection 2.2), practical systems have been built using only the layout analysis stage of DIA [9,26,35]. Availability of images makes it possible to provide content based image retrieval, using even structural features like color and layout.…”
Section: DIA Challenges In Historical DL Collections
Citation type: mentioning
confidence: 99%
“…On the other end of the spectrum, relatively modern printed documents do not suffer from significant substrate/ink degradation problems and lend themselves to a higher degree of automated processing, OCR (albeit not trivial) and more sophisticated content extraction and indexing [4]. Finally, there are specific applications such as the conversion of administrative documents, which are typically forms with fixed structure [1].…”
Section: Introduction
Citation type: mentioning
confidence: 99%