Proceedings of the 29th ACM International Conference on Multimedia 2021
DOI: 10.1145/3474085.3475388

DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction

Abstract: Figure 1: Qualitative rectified results of our Document Image Transformer (DocTr). The top row shows the distorted document images. The second row shows the rectified results after geometric unwarping and illumination correction.

Cited by 47 publications (12 citation statements)
References 37 publications (88 reference statements)
“…In particular, Bako et al (2016) implemented a novel method based on explicitly identifying shadowed regions and increasing their brightness. Further, other approaches bridge multiple methods, such as Feng et al (2021), who implements a transformer machine learning model that simultaneously deskews scanned documents ("geometric unwarping") and removes their shadows ("illumination correction").…”
Section: Color Corrections With Grayscale Output
Type: mentioning
confidence: 99%
“…Recently, Amir et al [26] propose to learn the orientation of words in a document and Das et al [6] propose to model the 3D shape of a document with a UNet [32]. Feng et al [9] introduce transformer [36] from natural language processing tasks to improve the feature representation. Das et al [7] predict local deformation fields and stitch them together with global information to obtain an improved unwarping.…”
Section: Rectification Based On Deep
Type: mentioning
confidence: 99%
“…However, it involves extra implicit learning to localize the foreground document besides predicting the rectification, which limits the performance. Hence, following [9,10], we adopt a preprocessing operation to remove the clustered background first, thus the following network can focus on the rectification of the distortion.…”
Section: Preprocessing
Type: mentioning
confidence: 99%