As a critical component of many remote sensing applications and model validation efforts, pixel-scale quantitative surface parameters are often affected by scale effects during acquisition, which biases the accuracy of image-scale parameters. Consequently, a succession of scale conversion methods has been proposed to correct the errors caused by scale effects. In this study, we propose ResTransformer, a deep learning model for scale conversion of surface reflectance using UAV images. The model fully extracts and fuses the features of the UAV images at the sample points and across the sample area, and establishes a high-dimensional nonlinear spatial correlation between the sample points and the target sample area, so that pixel-scale conversion of surface reflectance can be completed quickly and accurately. We collected and created a dataset of 500k samples to verify the accuracy and robustness of the model against traditional scale conversion methods. The results show that the ResTransformer model performs best, with an average MRE of 0.6440%, an average RMSE of 0.7460, and a correlation coefficient R of 0.99911; the improvements over the Simple Average baseline are 92.48%, 92.45%, and 16.59%, respectively. ResTransformer also shows the highest robustness and universality, adapting to pixel-scale conversion scenarios with different sample-area sizes, heterogeneous sample areas, and arbitrary sampling methods. This approach therefore offers a promising, highly accurate, and robust way to convert surface reflectance between scales at the pixel level.
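The abstract reports MRE, RMSE, and R but does not define them; the following is a minimal sketch of the standard definitions of these three metrics, assuming an element-wise comparison between predicted and reference reflectance values (the actual evaluation protocol used in the study may differ):

```python
import math

def mre(pred, ref):
    """Mean relative error, in percent: mean of |pred - ref| / |ref| * 100."""
    return 100.0 * sum(abs(p - r) / abs(r) for p, r in zip(pred, ref)) / len(ref)

def rmse(pred, ref):
    """Root-mean-square error between predictions and references."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def pearson_r(pred, ref):
    """Pearson correlation coefficient R between predictions and references."""
    n = len(ref)
    mp = sum(pred) / n
    mr = sum(ref) / n
    cov = sum((p - mp) * (r - mr) for p, r in zip(pred, ref))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    sr = math.sqrt(sum((r - mr) ** 2 for r in ref))
    return cov / (sp * sr)
```

Under these definitions, lower MRE and RMSE and an R closer to 1 indicate a better scale conversion, which is the sense in which the reported baseline improvements are computed.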