2019 IEEE/CVF International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2019.00022
DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks

Cited by 71 publications (172 citation statements)
References 36 publications
“…Our implementation takes around 0.67 to 0.72 seconds to process a 1024x960 image. We compare our results with Ma et al. [15] and Das and Ma et al. [6] on real-world document images. Compared with previous methods, our proposal rectifies various distortions while removing the background and replacing it with transparency (the visual comparison is shown in Fig.…”
Section: Experimental Setup and Results (mentioning)
confidence: 85%
“…Because the generated dataset is quite different from real-world images, [15], trained only on its own dataset, generalizes poorly when tested on real-world images. Das and Ma et al. [6] argue that dewarping models do not always perform well when trained on a synthetic dataset built with only 2D deformations, so they created the Doc3D dataset, which provides multiple types of pixel-wise document image ground truth by combining real-world documents with rendering software. In addition, [6] proposed a dewarping network and a refinement network to correct the geometry and shading of document images.…”
Section: Related Work (mentioning)
confidence: 99%
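The statement above describes a dewarping pipeline that predicts a dense mapping and resamples the input photo with it. As a rough illustration of that resampling step, the PyTorch sketch below applies a backward map to a distorted image with grid_sample; the tensor shapes, the `unwarp` helper, and the identity-map usage example are illustrative assumptions, not the cited authors' code.

```python
# Minimal sketch of applying a predicted backward map to unwarp a document
# photo, in the spirit of DewarpNet-style pipelines. All names and shapes
# here are assumptions for illustration.
import torch
import torch.nn.functional as F

def unwarp(distorted: torch.Tensor, backward_map: torch.Tensor) -> torch.Tensor:
    """Resample a distorted image with a dense backward map.

    distorted:    (B, 3, H, W) input document photo.
    backward_map: (B, 2, H, W) per-pixel (x, y) source coordinates in [-1, 1],
                  i.e. where each output pixel should be sampled from.
    """
    grid = backward_map.permute(0, 2, 3, 1)  # (B, H, W, 2) layout expected by grid_sample
    return F.grid_sample(distorted, grid, mode="bilinear", align_corners=True)

# Toy usage: an identity backward map leaves the image unchanged.
B, H, W = 1, 256, 256
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
identity = torch.stack([xs, ys], dim=0).unsqueeze(0)  # (1, 2, H, W)
img = torch.rand(B, 3, H, W)
out = unwarp(img, identity)
assert torch.allclose(out, img, atol=1e-5)
```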
“…Extensive experiments on several datasets, i.e., the Doc3D, DRIC, and DocUNet datasets, demonstrate the effectiveness and superiority of our DocTr over the existing state-of-the-art methods on both tasks. Notably, on the DocUNet benchmark [22], we achieve a significant improvement in OCR results (an absolute 15.32% reduction in Character Error Rate (CER) compared to the state-of-the-art method [7]). Furthermore, our method is efficient in terms of inference time and parameter count.…”
Section: Introduction (mentioning)
confidence: 92%
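The CER figure quoted above is the standard character-level edit-distance metric: the Levenshtein distance between the OCR output and the ground-truth text, normalized by the reference length. A minimal sketch of how such a value could be computed (the function name and the toy strings are illustrative, not taken from the cited papers):

```python
def character_error_rate(prediction: str, reference: str) -> float:
    """CER = Levenshtein distance between prediction and reference,
    normalized by the reference length."""
    m, n = len(prediction), len(reference)
    # dp[j] holds the edit distance between prediction[:i] and reference[:j].
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            cost = 0 if prediction[i - 1] == reference[j - 1] else 1
            dp[j] = min(dp[j] + 1,      # deletion
                        dp[j - 1] + 1,  # insertion
                        prev + cost)    # substitution (or match when cost == 0)
            prev = cur
    return dp[n] / max(n, 1)

# Toy usage: one substitution over an 11-character reference gives CER ≈ 0.09.
print(character_error_rate("documemt OK", "document OK"))
```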
“…Therefore, learning-based methods using only a single distorted image are being pursued [10, 11, 12, 42, 43, 44, 45, 46]. Deep learning methods for correcting documents were proposed recently [12, 44, 45, 46], implementing convolutional neural networks, encoder-decoders, and U-Net-based architectures [47]. Work on correcting portrait images used an encoder-decoder architecture [10].…”
Section: Related Work (mentioning)
confidence: 99%
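The statement above groups these correction methods as convolutional encoder-decoders and U-Net-based networks that regress a correction map from a single distorted image. The toy sketch below shows the general shape of such an encoder-decoder with one skip connection; the layer widths, the 2-channel output, and the class name are illustrative assumptions rather than any specific cited architecture.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Toy U-Net-style encoder-decoder mapping a distorted RGB image to a
    2-channel correction map (e.g. a backward flow), for illustration only."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU())
        self.out = nn.Conv2d(64, 2, 3, padding=1)  # 64 = 32 decoder + 32 skip channels

    def forward(self, x):
        e1 = self.enc1(x)                              # full-resolution features
        e2 = self.enc2(e1)                             # downsampled by 2
        d1 = self.dec1(e2)                             # back to full resolution
        return self.out(torch.cat([d1, e1], dim=1))    # skip connection, U-Net style

img = torch.rand(1, 3, 128, 128)
print(TinyUNet()(img).shape)  # torch.Size([1, 2, 128, 128])
```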