As global attention to renewable energy grows, photovoltaic (PV) power generation has attracted widespread interest as a clean and sustainable energy source. However, operating and monitoring PV power generation systems produces large volumes of data containing missing values, outliers, and noise, which complicates data analysis and application. PV data cleaning is therefore crucial for ensuring data quality and improving data availability and reliability. This study proposes a PV data cleaning method based on Rasterized Data Image Processing (RDIP), which combines rasterization with image processing techniques to select optimal contours and extract the essential data. To validate the method, we conducted comparative experiments with three data cleaning methods: our RDIP algorithm, the Pearson correlation coefficient interpolation method, and the cubic spline interpolation method. The datasets cleaned by each method were then used for power prediction with two linear regression models and two neural network models. The experimental results show that data cleaned with the RDIP algorithm improved short-term forecast accuracy by approximately 1.0% and 3.7%, respectively, compared with the other two methods, indicating the feasibility and effectiveness of the RDIP approach. However, RDIP relies on integer parameters for grid division, which can lead to coarse grids. Future research could focus on optimizing the selection of binarization thresholds to achieve better cleaning results and on exploring other applications of RDIP in PV data analysis.
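The general idea of rasterizing data, binarizing the resulting grid, and keeping the points inside a selected region can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`rasterize`, `largest_region`, `clean`), the use of (irradiance, power) pairs as input, the count-based binarization rule, and the choice of the largest 8-connected region as a stand-in for "optimal contour" selection are all assumptions made for illustration.

```python
from collections import deque


def rasterize(points, n_cols, n_rows):
    """Map each (x, y) point to an integer grid cell and count points per cell.

    Assumption: x might be irradiance and y power; any 2-D pairs work.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1.0 so the division is defined for constant data.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    cells, counts = [], {}
    for x, y in points:
        # Points on the upper edge are clamped into the last cell.
        col = min(int((x - x_min) / x_span * n_cols), n_cols - 1)
        row = min(int((y - y_min) / y_span * n_rows), n_rows - 1)
        cells.append((col, row))
        counts[(col, row)] = counts.get((col, row), 0) + 1
    return cells, counts


def largest_region(mask):
    """Return the largest 8-connected set of cells in a binary mask.

    Assumption: this stands in for the contour selection step.
    """
    seen, best = set(), set()
    for start in sorted(mask):
        if start in seen:
            continue
        region = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            c, r = queue.popleft()
            for dc in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    nxt = (c + dc, r + dr)
                    if nxt in mask and nxt not in seen:
                        seen.add(nxt)
                        region.add(nxt)
                        queue.append(nxt)
        if len(region) > len(best):
            best = region
    return best


def clean(points, n_cols=10, n_rows=10, threshold=2):
    """Keep only points whose cell lies in the dominant dense region.

    Assumption: a cell is "on" when it holds at least `threshold` points.
    """
    cells, counts = rasterize(points, n_cols, n_rows)
    mask = {cell for cell, n in counts.items() if n >= threshold}
    keep = largest_region(mask)
    return [p for p, cell in zip(points, cells) if cell in keep]
```

In this sketch `n_cols` and `n_rows` must be integers, so the cell size can only change in discrete steps, which is one way the coarse-grid limitation noted above can arise. The `threshold` argument plays the role of the binarization threshold, and the result is sensitive to its value.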