DoT-Net: Document Layout Classification Using Texture-Based CNN

Kosaraju, Sai; Masum, Mohammad; Tsaku, Nelson Zange; Patel, Pritesh; Bayramoglu, Tanju; Modgil, Girish; Kang, Min-Hee

doi:10.1109/icdar.2019.00168

Cited by 25 publications

(3 citation statements)

References 19 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Borges Oliveira and Viana (2017) proposed a fast automatic document layout method based on convolutional neural networks (CNN), which greatly improved overall performance. Moreover, Kosaraju et al (2019) proposed a texture-based convolutional neural network model called DoT-Net, which can effectively recognize document component blocks such as text, images, tables, mathematical expressions, and line graphs, solving the problems caused by location transformations, inter-class and intraclass variations, and background noise. Singh and Karayev (2021) unveil an architecture for a Handwritten Text Recognition (HTR) model based on Neural Networks, which is capable of recognizing complete pages of handwritten or printed text without the need for image segmentation.…”

Section: Related Work Deep Learning Based Genealogy Layout Recognitionmentioning

confidence: 99%

Sublinear information bottleneck based two-stage deep learning approach to genealogy layout recognition

You

Wang

2023

Front. Neurosci.

View full text Add to dashboard Cite

As an important part of human cultural heritage, the recognition of genealogy layout is of great significance for genealogy research and preservation. This paper proposes a novel method for genealogy layout recognition using our introduced sublinear information bottleneck (SIB) and two-stage deep learning approach. We first proposed an SIB for extracting relevant features from the input image, and then uses the deep learning classifier SIB-ResNet and object detector SIB-YOLOv5 to identify and localize different components of the genealogy layout. The proposed method is evaluated on a dataset of genealogy images and achieves promising results, outperforming existing state-of-the-art methods. This work demonstrates the potential of using information bottleneck and deep learning object detection for genealogy layout recognition, which can have applications in genealogy research and preservation.

show abstract

Section: Related Work Deep Learning Based Genealogy Layout Recognitionmentioning

confidence: 99%

Sublinear information bottleneck based two-stage deep learning approach to genealogy layout recognition

You

Wang

2023

Front. Neurosci.

View full text Add to dashboard Cite

show abstract

“…Chen et al 25 proposed a CNN for historical newspaper segmentation to distinguish text content from the background and other content types, such as figures, decoration, and comments. Kosaraju et al 27 adopted a CNN network with a dilated convolutional kernel to analyze document layouts. Renton et al 28 proposed a CNN-based network to segment handwritten text lines that have various issues such as slanted lines, overlapped texts, and inconsistent handwritten characters.…”

Section: Image Classification Using Cnnmentioning

confidence: 99%

Investigating coupling preprocessing with shallow and deep convolutional neural networks in document image classification

2021

View full text Add to dashboard Cite

Convolutional neural networks (CNNs) are effective for image classification, and deeper CNNs are being used to improve classification performance. Indeed, as needs increase for searchability of vast printed document image collections, powerful CNNs have been used in place of conventional image processing. However, better performances of deep CNNs come at the expense of computational complexity. Are the additional training efforts required by deeper CNNs worth the improvement in performance? Or could a shallow CNN coupled with conventional image processing (e.g., binarization and consolidation) outperform deeper CNNbased solutions? We investigate performance gaps among shallow (LeNet-5, -7, and -9), deep (ResNet-18), and very deep (ResNet-152, MobileNetV2, and EfficientNet) CNNs for noisy printed document images, e.g., historical newspapers and document images in the RVL-CDIP repository. Our investigation considers two different classification tasks: (1) identifying poems in historical newspapers and (2) classifying 16 document types in document images. Empirical results show that a shallow CNN coupled with computationally inexpensive preprocessing can have a robust response with significantly reduced training samples; deep CNNs coupled with preprocessing can outperform very deep CNNs effectively and efficiently; and aggressive preprocessing is not helpful as it could remove potentially useful information in document images.

show abstract

“…Wu et al (2023) proposed a dynamic fusion network based on features for layout analysis, giving an F1 score of 89.5% on the DSSE dataset and 95.1% on the CS-150. Kosaraju et al (2019) proposed a multiclass classifier to segment documents into various components using a convolutional neural network. Bin Makhashen and Mahmoud (2019) surveyed various layout analysis techniques, features, advantages, and disadvantages.…”

Section: Introductionmentioning

confidence: 99%

A Hybrid Approach for Complex Layout Detection of Newspapers in Gurumukhi Script Using Deep Learning

Kumar,

Lehal

2023

IJERR

View full text Add to dashboard Cite

Layout analysis is the crucial stage in the recognition system of newspapers. A good layout analysis results in better recognition results. In this paper, we detected the complex layout of newspapers in the Gurumukhi script. We have used a hybrid approach. In this approach, firstly, we proposed an algorithm to remove pictures from newspaper images that involves various image preprocessing tasks based on binarization, finding contours, and erosion on the image to remove the graphics from the image. This method also removes pictures from complex non-Manhattan layouts. Finally, we have trained the deep-leaning model based on a convolutional network to detect the columns of text from newspapers. We have created a dataset of 500 images labelled with five classes on which the model was trained. We have tested this method on the number of newspapers of the Gurumukhi script. The results show very good accuracy with this hybrid approach of layout detection.

show abstract

DoT-Net: Document Layout Classification Using Texture-Based CNN

Cited by 25 publications

References 19 publications

Sublinear information bottleneck based two-stage deep learning approach to genealogy layout recognition

Sublinear information bottleneck based two-stage deep learning approach to genealogy layout recognition

Investigating coupling preprocessing with shallow and deep convolutional neural networks in document image classification

A Hybrid Approach for Complex Layout Detection of Newspapers in Gurumukhi Script Using Deep Learning

Contact Info

Product

Resources

About