2022
DOI: 10.1007/s10032-022-00405-8

A survey of historical document image datasets

Abstract: This paper presents a systematic literature review of image datasets for document image analysis, focusing on historical documents, such as handwritten manuscripts and early prints. Finding appropriate datasets for historical document analysis is a crucial prerequisite to facilitate research using different machine learning algorithms. However, because of the very large variety of the actual data (e.g., scripts, tasks, dates, support systems, and amount of deterioration), the different formats for data and lab…

Citations: cited by 22 publications (7 citation statements)
References: 169 publications
“…LDM (Rombach et al 2022) proposes a cross-attention mechanism to incorporate the condition into the UNet and treats the diffusion process in the latent space. In text image generation, (Luhman and Luhman 2020; Gui et al 2023; Nikolaidou et al 2023) apply diffusion models to generate handwritten characters and demonstrate their promising effects. CTIG-DM (Zhu et al 2023) devises image, text, and style as conditions and introduces four text image generation modes in a diffusion model.…”
Section: Diffusion Model; citation type: mentioning
confidence: 99%
“…While page segmentation approaches, such as [7], return text/non-text masks, the latter of which often includes visual elements, this remains insufficient for any meaningful historical study, as such approaches often lack accurate visual element localization, as well as semantic classification of these elements. In this regard, one of the main hurdles hindering the success of semantic visual element recognition within historical documents is their high variability, as well as the general scarcity of coherent historical datasets focused on visual element recognition, with only 11 out of the 56 historical document datasets mentioned in [15] containing graphical elements. Additionally, in the majority of cases where visual elements were recorded in these datasets, they were not classified according to their semantic classes [16].…”
Section: State of the Art; citation type: mentioning
confidence: 99%
“…In contrast, modern learning-based methods exhibit the ability to infer styled glyphs, even in cases where they have not been directly observed in the reference style examples. Despite a few attempts to perform HTG with Diffusion Models [33], [34], the most typical strategy is to leverage GANs, which can be unconditioned in the case of non-stylized HTG or conditioned on a variable number of handwriting style samples in the case of stylized HTG.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…However, there is no agreement on the split to adopt, i.e., on which authors and relative images should be included in the training set and which in the test. As a result, some works adopt the standard HTR split [1], [3], [11], [25], [34], [47], [48] (commonly known as Aachen split), while others [2], [24], [26], [27], [29], [33], [49]-[51] consider the original split proposed with the IAM dataset, which entails a different distribution of the authors between training and test. Moreover, the text content of the generated words and the style samples considered for each author in styled-HTG are usually selected randomly, thus further hindering the fair comparison also between approaches adopting the same IAM splitting.…”
Section: Related Work; citation type: mentioning
confidence: 99%