Image-based logical document structure recognition

Kamola, Grzegorz; Spytkowski, Michał; Paradowski, Mariusz; Markowska–Kaczmar, Urszula

doi:10.1007/s10044-014-0412-8

Cited by 8 publications

(3 citation statements)

References 37 publications

“…In [30], the authors rely on features derived from the geometry of the document and perform hierarchical graph coloring to retrieve the structure of postal mails. In [39], text-lines are grouped based on alignment, distance and graphical features like font, thickness and color to form homogeneous zones. It is also common to gradually merge connected components to obtain text-blocks in printed documents [4,37]. Clustering methods are also applied to find text-lines using generic features, such as orientation features [40,71].…”

Section: Bottom-up or Data-driven Strategies (mentioning)

confidence: 99%

Combination of deep neural networks and logical rules for record segmentation in historical handwritten registers using few examples

Tarride, Lemaitre, Coüasnon, et al., 2021. IJDAR


This work focuses on the layout analysis of historical handwritten registers, in which local religious ceremonies were recorded. The aim of this work is to delimit each record in these registers. To this end, two approaches are proposed. Firstly, object detection networks are explored, as three state-of-the-art architectures are compared. Further experiments are then conducted on Mask R-CNN, as it yields the best performance. Secondly, we introduce and investigate Deep Syntax, a hybrid system that takes advantage of recurrent patterns to delimit each record, by combining U-shaped networks and logical rules. Finally, these two approaches are evaluated on 3708 French records (16th-18th centuries), as well as on the Esposalles public database, containing 253 Spanish records (17th century). While both systems perform well on homogeneous documents, we observe a significant drop in performance with Mask R-CNN on heterogeneous documents, especially when trained on a non-representative subset. By contrast, Deep Syntax relies on steady patterns, and is therefore able to process a wider range of documents with less training data. Not only does Deep Syntax produce 15% more match configurations and reduce the ZoneMap surface error metric by 30% when both systems are trained on 120 images, but it also outperforms Mask R-CNN when trained on a database three times smaller. As Deep Syntax generalizes better, we believe it can be used in the context of massive document processing, as collecting and annotating a sufficiently large and representative set of training data is not always achievable.

“…Carry out automatic image segmentation. Image segmentation is meant to separate distinct elements in an image from other elements [26]. After these distinctive elements have been separated, further operations can be performed, such as identifying individual elements or measuring their size.…”

Section: Blood Vein Detection Algorithm (mentioning)

confidence: 99%

Low cost blood vein detection system based on near-infrared LEDs and image-processing techniques

Alwazzan, 2020. Polish Journal of Medical Physics and Engineering


Drawing blood and injecting drugs are common medical procedures, for which accurate identification of veins is needed to avoid causing unnecessary pain. In this paper, we propose a low-cost system for the detection of veins. The system emits near-infrared radiation from four light-emitting diodes (LEDs), with a charge-coupled device (CCD) camera located in the middle of the LEDs. The camera captures an image of the palm of the hand. A series of digital image-processing techniques, ranging from image enhancement and increased contrast to isolation using a threshold limit based on statistical properties, are applied to effectively isolate the veins from the rest of the image.


“…Nowadays, academic papers are widely available from popular databases such as Google Scholar and CiNii in Japan. To make the best use of papers, there has been much research into the recognition of logical structures in documents [9] and keyword extraction from academic papers [3]. In particular, tables are often used to show statistics and experimental results in academic papers, while graphical structures, rather than tabular structures, are better suited to visually comparing many values at once.…”

Section: Introduction (mentioning)

confidence: 99%

Table-structure recognition method using neural networks for implicit ruled line estimation and cell estimation

Ohta, Yamada, Kanazawa, et al., 2021. Proceedings of the 21st ACM Symposium on Document Engineering


Tables are often used to summarize accurate values in academic papers, while graphs are used to show them visually. Automatic graph generation from a table is therefore a topic of research interest. Given that the way tables are written varies depending on the author, in earlier work we proposed a cell-detection-based table-structure recognition method. Our method achieved fair performance in experiments using the ICDAR 2013 table competition dataset, but could not outperform the top-ranked participant in the competition. This paper proposes an improved method using two neural networks: one estimates implicit ruled lines that are necessary to separate cells but are undrawn, and the other estimates cells by merging detected tokens in a table. We demonstrated the effectiveness of the proposed method by experiments using the same ICDAR 2013 dataset. It achieved an F-measure of 0.955, thereby outperforming the other methods including the top-ranked participant.

CCS Concepts: Applied computing → Document management and text processing; Document analysis.
