Shadow removal for document images is an important task for digitized document applications. Recent shadow removal models are trained on pairs of shadow and shadow-free images. However, obtaining a large-scale and diverse dataset is laborious and remains a great challenge, so only small real datasets are available. To create relatively large datasets, a graphics renderer has been used to synthesize shadows; nonetheless, it is still necessary to capture real documents, so the number of unique documents is limited, which negatively affects a network's performance. In this paper, we present a large-scale and diverse dataset, the fully synthetic document shadow removal dataset (FSDSRD), that does not require capturing documents. Experiments showed that networks (pre-)trained on FSDSRD outperform networks trained only on real datasets. Additionally, because foreground maps are available in our dataset, we leverage them during training for multitask learning, which provides noticeable improvements. The code is available at: https://github.com/IsHYuhi/DSRFGD.
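The multitask setup mentioned above can be sketched as a weighted sum of a shadow-removal reconstruction loss and a foreground-map loss. The following is a minimal illustrative sketch, not the paper's actual formulation: the loss names, the choice of L1 and binary cross-entropy, and the weighting factor `alpha` are all assumptions.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error between predicted and ground-truth shadow-free images."""
    return float(np.mean(np.abs(pred - target)))

def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy for the predicted foreground probability map."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(target * np.log(pred) + (1 - target) * np.log(1 - pred))))

def multitask_loss(img_pred, img_gt, fg_pred, fg_gt, alpha=0.5):
    """Combined objective: reconstruction term plus a weighted foreground term.

    `alpha` (a hypothetical hyperparameter) balances the auxiliary
    foreground-map task against the primary shadow-removal task.
    """
    return l1_loss(img_pred, img_gt) + alpha * bce_loss(fg_pred, fg_gt)
```

In practice both outputs would come from a shared backbone with two heads, so gradients from the foreground branch regularize the shared features.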
Figure 1. We generate a plausible environment from a narrow-field-of-view image using a transformer-based outpainting method that accounts for the nature of 360-degree images, enabling efficient 3DCG scene creation. See also the demonstrations in the supplementary video.