2019 International Conference on Document Analysis and Recognition Workshops (ICDARW)
DOI: 10.1109/icdarw.2019.40080
Cluster-Based Sample Selection for Document Image Binarization

Abstract: The current state-of-the-art, in terms of performance, for solving document image binarization is training artificial neural networks on pre-labelled ground truth data. As such, it faces the same issues as other, more conventional, classification problems: requiring a large amount of training data. However, unlike those conventional classification problems, document image binarization involves having to either manually craft or estimate the binarized ground truth data, which can be error-prone and time-consuming…
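As a point of reference for the abstract's framing, the sketch below shows binarization posed as supervised per-pixel classification, the setup the paper starts from. It is a minimal illustration, not the paper's model: the tiny fully convolutional network, the 64×64 patch size, and the binary cross-entropy loss are assumptions chosen for brevity (PyTorch).

```python
# Minimal sketch (not the paper's code): train a small fully convolutional
# network to classify each pixel of a grayscale document patch as ink or
# background, given pre-labelled binarized ground truth.
import torch
import torch.nn as nn

class TinyBinarizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),  # one ink/background logit per pixel
        )

    def forward(self, x):
        return self.net(x)

model = TinyBinarizer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

# Stand-ins for labelled data: grayscale patches in [0, 1] and their
# binarized ground-truth masks (0 = background, 1 = ink).
patches = torch.rand(8, 1, 64, 64)
targets = (torch.rand(8, 1, 64, 64) > 0.5).float()

for step in range(50):
    opt.zero_grad()
    loss = loss_fn(model(patches), targets)
    loss.backward()
    opt.step()
```

The cost the abstract points at sits in `targets`: every labelled patch requires a manually crafted or estimated binarized mask, which is what motivates selecting fewer, more informative samples.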

Cited by 6 publications (4 citation statements) · References 34 publications

Citation statements, ordered by relevance:
“…
- Game theory: GiB [65] extracts features for clustering using game theory.
- Shallow ML: Kasmin [66], an ensemble of 8 SVMs. #: Hamza [67], Rabelo [68], Pastor [69].
- Deep learning: FCNs (Pastor [70], Calvo-Zaragoza [71], PDNet [72]); morphological networks with grayscale dilation and erosion operations (Mondal [73]); U-Net architecture with attention layers (Kang [74]); GANs (Tensmeyer [75], Zhao [76]); LSTMs (Afzal [77], Westphal [78]). + Deep learning methods outperform a non-convolutional deep MLP (including histogram and median filter features). − Less availability of labelled data for training; recurrent networks (LSTM, BiLSTM) not as effective as FCNs. #: Bhunia [79], Krantz [80].
- Parameter tuning: supervised tuning (Xiong [81], Messaoud [82]); unsupervised tuning (Ntirogiannis [83], Liang [84]). + Finds the best parameter settings for an algorithm. − The algorithm's output is dependent on these parameter settings.
(+ pros, − cons, # related research)

… layout analysis procedure, as the document types are varied. With this information, there are two levels of segmentation.…”
Section: Game Theory (mentioning)
confidence: 99%
“…Bhunia et al. [14] proposed a unique approach to train a binarization network using unpaired training data (i.e., the grayscale and binary images do not correspond) and achieved an impressive 97.8% FM on DIBCO13. To reduce the amount of training data needed, Krantz and Westphal [68] proposed a clustering method to ensure only diverse data are labeled. This reduces the dataset size by 50% at a modest loss in accuracy.…”
Section: Deep Neural Network (mentioning)
confidence: 99%
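The selection strategy described in this statement, label only a diverse subset found by clustering, can be sketched as follows. This is a hedged illustration, not the method of Krantz and Westphal: the flattened-feature input, k-means, and nearest-to-centroid representative choice are all assumptions, and setting the number of clusters to half the pool size simply mirrors the reported ~50% reduction.

```python
# Hedged sketch of cluster-based sample selection: cluster patch feature
# vectors, then keep only the patch nearest each centroid, so the set
# sent for labelling stays small but diverse. Features and clustering
# algorithm are illustrative assumptions, not the cited paper's choices.
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(features: np.ndarray, n_clusters: int) -> np.ndarray:
    """Return the index of one representative sample per cluster."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    reps = []
    for c in range(n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        dists = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        reps.append(members[np.argmin(dists)])
    return np.asarray(reps)

# e.g. halve a pool of 200 patch descriptors down to 100 patches to label
feats = np.random.rand(200, 64)
to_label = select_representatives(feats, n_clusters=100)
```

Picking the member nearest each centroid spreads the labelled subset across the feature space instead of concentrating it in dense regions, which is what keeps accuracy loss modest as the pool shrinks.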
“…There is also limited labeled training data for deep models. Though some initial explorations around this issue have been made [14,55,68,165], there is still a large gap to address.…”
Section: Technical Challenges (mentioning)
confidence: 99%
“…However, one drawback of this approach is that it relies on label information in the original training set. Krantz and Westphal [13] follow a similar strategy for selecting training samples for image binarization using a relative neighborhood graph. In contrast to Rayar et al., their approach avoids the need for a labeled set to choose from by creating pseudo labels through clustering.…”
Section: Related Work (mentioning)
confidence: 99%
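For context on the graph mentioned above: a relative neighborhood graph connects two points exactly when no third point is strictly closer to both of them than they are to each other. The naive O(n³) construction below, over generic feature vectors, is an illustrative sketch of the definition only; how the cited works combine the graph with clustering-derived pseudo labels to pick samples follows their papers, not this code.

```python
# Hedged sketch of a relative neighborhood graph (RNG): points p and q
# share an edge iff no third point r satisfies
#   max(d(p, r), d(q, r)) < d(p, q).
# Naive O(n^3) construction for illustration only.
import numpy as np

def relative_neighborhood_graph(points: np.ndarray) -> list[tuple[int, int]]:
    # Pairwise Euclidean distance matrix via broadcasting.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    n = len(points)
    edges = []
    for p in range(n):
        for q in range(p + 1, n):
            blocked = any(
                max(d[p, r], d[q, r]) < d[p, q]
                for r in range(n) if r != p and r != q
            )
            if not blocked:
                edges.append((p, q))
    return edges

pts = np.random.rand(30, 2)  # stand-in for patch feature vectors
print(relative_neighborhood_graph(pts)[:5])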