Simplifying the reading of historical manuscripts

Asi, Abedelkadir; Cohen, Rafi; Kedem, Klara; El-Sana, Jihad

doi:10.1109/icdar.2015.7333877

Cited by 21 publications

(10 citation statements)

References 29 publications


“…Document layout segmentation using heuristic methods is classified mainly under three different categories: top-down, bottom-up and hybrid strategies. The bottom-up methods [2,31,34] used pixels as basic components, and performed operations like merging and grouping to form a larger homogeneous region. On the other hand, top-down methods [20,23,29] relied on splitting the whole document image iteratively into different regions, until a definite standard column or block was obtained.…”

Section: Heuristic Rule-based Document Layout Analysis (mentioning)

confidence: 99%

DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer

Biswas, Banerjee, Lladós et al. 2022

Preprint


Understanding documents with rich layouts is an essential step towards information extraction. Business intelligence processes often require the extraction of useful semantic content from documents at a large scale for subsequent decision-making tasks. In this context, instance-level segmentation of different document objects (title, sections, figures, tables and so on) has emerged as an interesting problem for the document layout analysis community. To advance the research in this direction, we present a transformer-based model for end-to-end segmentation of complex layouts in document images. To our knowledge, this is the first work on transformer-based document segmentation. Extensive experimentation on the PubLayNet dataset shows that our model achieved comparable or better segmentation performance than the existing state-of-the-art approaches. We hope our simple and flexible framework could serve as a promising baseline for instance-level recognition tasks in document images.


“…Saabni and El-Sana [40] proposed a method that developed seam lines among text lines using energy maps. Asi et al. [2] came up with a multi-scale texture-based algorithm for document images where Gabor filters were applied to locate different regions and a minimization energy function was applied to segment them. Despite successes of both top-down and bottom-up strategies, there are techniques [51] that have integrated both of them to segment regions in digital documents with complex layouts.…”

Section: Traditional Document Layout Segmentation (mentioning)

confidence: 99%

Beyond document object detection: instance-level segmentation of complex layouts

Biswas, Riba, Lladós et al. 2021

IJDAR


Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we importantly contribute to prior arts by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art.


“…In [7], connected components are aggregated before vertical and horizontal white spaces are detected to produce a mask of areas of interest. Asi et al [5] manage to simplify the layout of historical documents by locating, segmenting, and dewarping text lines with severe curvature. These strategies have also been applied to segment text-lines.…”

Section: Hybrid Strategies (mentioning)

confidence: 99%

Combination of deep neural networks and logical rules for record segmentation in historical handwritten registers using few examples

Tarride, Lemaitre, Coüasnon et al. 2021

IJDAR


This work focuses on the layout analysis of historical handwritten registers, in which local religious ceremonies were recorded. The aim of this work is to delimit each record in these registers. To this end, two approaches are proposed. Firstly, object detection networks are explored, as three state-of-the-art architectures are compared. Further experiments are then conducted on Mask R-CNN, as it yields the best performance. Secondly, we introduce and investigate Deep Syntax, a hybrid system that takes advantage of recurrent patterns to delimit each record, by combining U-shaped networks and logical rules. Finally, these two approaches are evaluated on 3708 French records (16th-18th centuries), as well as on the Esposalles public database, containing 253 Spanish records (17th century). While both systems perform well on homogeneous documents, we observe a significant drop in performance with Mask R-CNN on heterogeneous documents, especially when trained on a non-representative subset. By contrast, Deep Syntax relies on steady patterns, and is therefore able to process a wider range of documents with less training data. Not only does Deep Syntax produce 15% more match configurations and reduce the ZoneMap surface error metric by 30% when both systems are trained on 120 images, but it also outperforms Mask R-CNN when trained on a database three times smaller. As Deep Syntax generalizes better, we believe it can be used in the context of massive document processing, as collecting and annotating a sufficiently large and representative set of training data is not always achievable.
