2022
DOI: 10.36227/techrxiv.19310489.v3
Preprint

DocXClassifier: High Performance Explainable Deep Network for Document Image Classification

Abstract: Convolutional Neural Networks (ConvNets) have been thoroughly researched for document image classification and are known for their exceptional performance in unimodal image-based document classification. Recently, however, there has been a sudden shift in the field towards multimodal approaches that simultaneously learn from the visual and textual features of the documents. While this has led to significant advances in the field, it has also led to a waning interest in improving pure ConvNets-based ap…

Citations: cited by 2 publications (20 citation statements)
References: 28 publications
“…In addition, the unintentional memorization [17] of training samples in these models could directly expose information about the training dataset. Surprisingly, while a plethora of research has been conducted on both document classification [5,6,35] and privacy in textual documents [13,14,18,30], we found no existing literature in the field that addresses the issue of data privacy and potential information leakage from AI-powered document image classification systems. In this work, therefore, we investigate the potential of latest privacy preservation techniques [22,23,26,28] in combination with state-of-the-art DL-based document image classification models to assess whether they can achieve sufficient utility under strong privacy constraints.…”
Section: Introduction (mentioning)
confidence: 84%
“…From the work of Ferrando et al. (2020) [6], we investigate the EfficientNet-B4 [61], which showed the highest performance on the RVL-CDIP [34] dataset at the time. From the work of Saifullah et al. (2022) [35], we investigated both ConvNext-B [2] and DocXClassifier-B [35] models, which demonstrate the current state-of-the-art performance in image-based document classification on both RVL-CDIP [34] and Tobacco3482 datasets. Finally, since Vision Transformers (ViTs) have also been explored in multiple recent studies [48][49][50][51] and show promising results, we also investigated two standard ViTs, namely ViT-B/16 [50] and ViT-L/32 [50], to assess their performance in comparison to the CNN architectures under private training.…”
Section: Models (mentioning)
confidence: 99%