A selectional auto-encoder approach for document image binarization

Calvo-Zaragoza, Jorge; Gallego, Antonio‐Javier

doi:10.1016/j.patcog.2018.08.011

Cited by 139 publications

(122 citation statements)

References 34 publications

Mentioning: 122


“…In summary, the differences of the proposed method with the existed works [16,20,14,21] are summarized as follows. (1) Unlike the previous methods which train the neural network to learn the labels of each pixel, the output of our method is the latent uniform version of the input images, which represents an internally enhanced version of the image.…”

Section: Introduction (mentioning)

confidence: 99%

DeepOtsu: Document enhancement and binarization using iterative deep learning

Schomaker

2019

Pattern Recognition

127


This paper presents a novel iterative deep learning framework and applies it to document enhancement and binarization. Unlike traditional methods, which predict the binary label of each pixel on the input image, we train the neural network to learn the degradations in document images and produce uniform versions of the degraded input images, which allows the network to refine the output iteratively. Two different iterative methods have been studied in this paper: recurrent refinement (RR), which uses the same trained neural network in each iteration for document enhancement, and stacked refinement (SR), which uses a stack of different neural networks for iterative output refinement. Given the learned uniform and enhanced image, the binarization map is easy to obtain with a global or local threshold. The experimental results on several public benchmark data sets show that our proposed methods provide a new, clean version of the degraded image that is suitable for visualization, and yield promising binarization results when Otsu's global threshold is applied to the enhanced images learned iteratively by the neural network.
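The pipeline described in this abstract (iterative enhancement followed by a global Otsu threshold) can be illustrated with a minimal pure-Python sketch. This is not the DeepOtsu implementation: the `Enhancer` type is a hypothetical stand-in for a trained network, and the function names `recurrent_refinement`, `stacked_refinement`, `otsu_threshold`, and `binarize` are assumptions chosen for illustration. Only the Otsu step follows the standard definition (choose the gray level that maximizes between-class variance).

```python
from typing import Callable, List, Sequence

Image = List[List[int]]              # grayscale image, values in 0..255
Enhancer = Callable[[Image], Image]  # hypothetical stand-in for a trained network


def recurrent_refinement(enhance: Enhancer, image: Image, iterations: int) -> Image:
    """RR: apply the same enhancer at every iteration."""
    for _ in range(iterations):
        image = enhance(image)
    return image


def stacked_refinement(enhancers: Sequence[Enhancer], image: Image) -> Image:
    """SR: apply a stack of different enhancers, one per stage."""
    for enhance in enhancers:
        image = enhance(image)
    return image


def otsu_threshold(image: Image) -> int:
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image: Image, threshold: int) -> Image:
    """Map dark pixels (<= threshold) to 0 (ink) and the rest to 255 (background)."""
    return [[0 if value <= threshold else 255 for value in row] for row in image]


if __name__ == "__main__":
    toy = [[10, 12, 200], [11, 210, 205]]
    t = otsu_threshold(toy)
    print(t, binarize(toy, t))
```

In this sketch the enhancers are arbitrary callables, so any image-to-image function can be plugged in to experiment with the difference between reusing one model (RR) and chaining several (SR).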


“…In the work of Vo et al [17], a Markov Random Field is used to binarize the documents from a foreground modeling based on the color of the staff lines. However, the main problem found in these options is the varying performance depending on the characteristics of the document [18,19].…”

Section: Introduction (mentioning)

confidence: 99%

Deep Neural Networks for Document Processing of Music Score Images

et al. 2018

Self Cite


There is an increasing interest in the automatic digitization of medieval music documents. Despite efforts in this field, the detection of the different layers of information on these documents still poses difficulties. The use of Deep Neural Network techniques has yielded outstanding results in many areas related to computer vision. Consequently, in this paper, we study the so-called Convolutional Neural Networks (CNN) for performing the automatic document processing of music score images. This process is focused on layering the image into its constituent parts (namely, background, staff lines, music notes, and text) by training a classifier with examples of these parts. Comprehensive experimentation on the configuration of the networks was carried out, which yields interesting results regarding both the efficiency and effectiveness of these models. In addition, a cross-manuscript adaptation experiment is presented in which the networks are evaluated on a manuscript different from the one on which they were trained. The results suggest that the CNN is capable of adapting its knowledge, and so starting from a pre-trained CNN reduces (or eliminates) the need for new labeled data.


“…In Table 3 we show the measurements of previously trained network for the H-DIBCO'18 dataset. We have to notice that it outperformed all participants of the H-DIBCO'18 on the target dataset [37]. Moreover, the organizers also have published results of proposed methods obtained for DIBCO'17 dataset in [37] in Table II.…”

Section: Results (mentioning)

confidence: 91%

“…We have to notice that it outperformed all participants of the H-DIBCO'18 on the target dataset [37]. Moreover, the organizers also have published results of proposed methods obtained for DIBCO'17 dataset in [37] in Table II. The situation here is the same: no new method was good enough to improve results of the 2017 year.…”

Section: Results (mentioning)

confidence: 91%

U-Net-bin: hacking the document image binarization contest

Bezmaternykh, Ilin, Nikolaev

2019

Computer Optics


Image binarization is still a challenging task in a variety of applications. In particular, the Document Image Binarization Contest (DIBCO) is organized regularly to track the state-of-the-art techniques for historical document binarization. In this work we present a binarization method that was ranked first in the DIBCO'17 contest. It is a convolutional neural network (CNN) based method which uses the U-Net architecture, originally designed for biomedical image segmentation. We describe our approach to training data preparation and contest ground truth examination and provide multiple insights on its construction (so-called hacking). It led to a more accurate problem statement for historical document binarization with respect to the challenges one could face in the open access datasets. A docker container with the final network along with all the supplementary data we used in the training process has been published on GitHub.

