With the rapid development of document digitization, people have become accustomed to capturing and processing documents with electronic devices such as smartphones. However, captured document images often suffer from shadows and noise caused by environmental factors, which can reduce their readability. To improve the quality of captured document images, researchers have proposed a series of models and frameworks and applied them in scenarios such as image enhancement and document information extraction. In this paper, we focus primarily on shadow removal methods for document images and on open-source datasets. We concentrate on recent advances in this area, first organizing and analyzing nine available datasets. We then categorize the methods into conventional methods and neural network-based methods. Conventional methods rely on manually designed features and include shadow map-based approaches and illumination-based approaches. Neural network-based methods learn features automatically from data and are divided into single-stage approaches and multi-stage approaches. We detail representative algorithms and briefly describe some typical techniques. Finally, we analyze and discuss experimental results, identifying the limitations of existing datasets and methods. We also discuss future research directions and propose nine suggestions for shadow removal from document images. To our knowledge, this is the first survey of methods and related datasets for shadow removal from document images.