Climate projections at fine spatial resolution are required for accurate risk assessment of critical infrastructure and for adaptation planning. Generating these projections with advanced Earth system models (ESMs) requires significant computational resources. To address this issue, various statistical downscaling techniques have been introduced to generate fine-resolution data from coarse-resolution simulations. In this study, we evaluate and compare five deep learning-based downscaling techniques, namely, super-resolution convolutional neural network, fast super-resolution convolutional neural network, efficient sub-pixel convolutional neural network, enhanced deep residual network (EDRN), and super-resolution generative adversarial network (SRGAN). These techniques are applied to a dataset generated by the Energy Exascale Earth System Model (E3SM), focusing on key surface variables: surface temperature, shortwave heat flux, and longwave heat flux. Models are trained and validated using paired fine-resolution (0.25°) and coarse-resolution (1°) monthly data obtained from a 9-year simulation. Blind testing is then performed using monthly data from two different years outside the training and validation set. To evaluate the performance of each technique, several metrics are used, including mean squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and learned perceptual image patch similarity (LPIPS). The results show that EDRN outperforms the other algorithms in terms of PSNR, SSIM, and MSE, but struggles to capture fine-scale features in the data. In contrast, SRGAN, a generative model that uses perceptual loss, excels at capturing fine details at boundaries and in internal structures, resulting in a lower LPIPS than the other methods.
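For illustration, the two simplest metrics named above, MSE and PSNR, can be sketched as follows. This is a minimal sketch and not the evaluation code used in the study: the function names `mse` and `psnr`, the toy 2×2 temperature grid, and the choice of the reference field's max-minus-min as the PSNR data range are all assumptions made for the example.

```python
import math


def mse(reference, prediction):
    """Mean squared error between two equally shaped 2D grids (lists of rows)."""
    total, count = 0.0, 0
    for ref_row, pred_row in zip(reference, prediction):
        for r, p in zip(ref_row, pred_row):
            total += (r - p) ** 2
            count += 1
    return total / count


def psnr(reference, prediction, data_range=None):
    """Peak signal-to-noise ratio in dB.

    Assumption: when data_range is not given, it defaults to the
    max-minus-min of the reference field.
    """
    if data_range is None:
        flat = [v for row in reference for v in row]
        data_range = max(flat) - min(flat)
    err = mse(reference, prediction)
    if err == 0:
        return math.inf  # identical fields
    return 10.0 * math.log10(data_range ** 2 / err)


# Hypothetical fine-resolution surface temperature (K) and a downscaled estimate.
truth = [[280.0, 282.0], [284.0, 290.0]]
estimate = [[281.0, 282.0], [283.0, 290.0]]

print(f"MSE  = {mse(truth, estimate):.3f}")      # MSE  = 0.500
print(f"PSNR = {psnr(truth, estimate):.2f} dB")  # PSNR = 23.01 dB
```

Lower MSE and higher PSNR both indicate closer pixel-wise agreement. Neither measures perceptual or structural fidelity, which is why a method can rank first on these metrics while still smoothing out fine-scale features.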