Convolutional Neural Networks for Document Image Classification

Kang, Le; Kumar, Jayant; Ye, Peng; Yi, Li; Doermann, David

doi:10.1109/icpr.2014.546

Cited by 153 publications

(91 citation statements)

References 12 publications

“…In the present work, deep convolutional neural networks (DCNN) are used for automatically understanding the structural aspects of a document for the purpose of classification. While DCNN based approaches are not new to this area [7]-[9], the present study distinguishes itself by studying the rapid training of effective document region based classifiers. To achieve the same, multiple levels of transfer learning are used.…”

Section: A Contribution (mentioning)

confidence: 99%

Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks

Das, Roy, Bhattacharya et al. 2018

2018 24th International Conference on Pattern Recognition (ICPR)

“…Convolutional neural networks have been used extensively on document images, e.g. segmenting documents [Yang et al, 2017, Chen et al, 2017a, Wick and Puppe, 2018], spotting handwritten words [Sudholt and Fink, 2016], classifying documents [Kang et al, 2014] and more broadly detecting text in natural scenes [Liao et al, 2017, Borisyuk et al, 2018]. In contrast to our task, these are trained on explicitly labeled datasets with information on where the targets are, e.g.…”

Section: End-to-end Methods (mentioning)

confidence: 99%

Attend, Copy, Parse End-to-end Information Extraction from Documents

Palm, Laws, Winther 2019

2019 International Conference on Document Analysis and Recognition (ICDAR)

Document information extraction tasks performed by humans create data consisting of a PDF or document image input, and extracted string outputs. This end-to-end data is naturally consumed and produced when performing the task because it is valuable in and of itself. It is naturally available, at no additional cost. Unfortunately, state-of-the-art word classification methods for information extraction cannot use this data, instead requiring word-level labels which are expensive to create and consequently not available for many real life tasks. In this paper we propose the Attend, Copy, Parse architecture, a deep neural network model that can be trained directly on end-to-end data, bypassing the need for word-level labels. We evaluate the proposed architecture on a large diverse set of invoices, and outperform a state-of-the-art production system based on word classification. We believe our proposed architecture can be used on many real life information extraction tasks where word classification cannot be used due to a lack of the required word-level labels.

“…There is also a vast amount of literature on constructing classifiers [8][9][10][11][12][13][14][15][16][17]. There exist a myriad methods to partition our multidimensional feature space into several classification regions.…”

Section: Results (mentioning)

confidence: 99%

“…The schemes aim to distinguish the content of the input document image, such as the ad, email, news and report. The manner [17] can achieve higher accuracy than [15] by utilizing speeded up robust features (SURF). Consequently, to design an efficient copy mode selection for low-end digital copier, the complexity, time consuming and accuracy should be the major concerns.…”

Section: Introduction (mentioning)

confidence: 99%

Efficient and accurate document image classification algorithms for low-end copy pipelines

Huang, Chen, Lin et al. 2016

J Image Video Proc.

The copy mode selection, such as the text mode and photo mode, of a digital copy machine can provide suitable process and enhancement for the scanned image. To classify the scanned image without expensive hardware and reduce the running time, in this article, we designed an efficient automatic method for classifying a document image using a probabilistic decision strategy. The proposed algorithm is tailored to inexpensive hardware and significantly reduces both the running time and memory requirements compared to the existing algorithms, while substantially improving the classification accuracy. In addition, we incorporate a new classification module to help avoid moiré patterns by identifying periodic halftone noise.
