Abstract. Evapotranspiration (ET) plays an important role in surface-atmosphere interactions and can be monitored using remote sensing data. However, surface heterogeneity, including the inhomogeneity of landscapes and surface variables, significantly affects the accuracy of ET estimated from satellite data. The objective of this study is to assess and reduce the uncertainties that surface heterogeneity introduces into remotely sensed ET, using data from the Chinese HJ-1B satellite, which has a spatial resolution of 30 m in the VIS/NIR bands and 300 m in the thermal-infrared (TIR) band. A temperature-sharpening and flux aggregation scheme (TSFA) was developed to obtain accurate heat fluxes from the HJ-1B satellite data. The IPUS (input parameter upscaling) and TRFA (temperature resampling and flux aggregation) methods were compared with TSFA in this study. The three methods represent three typical schemes used to handle mixed pixels, from the simplest to the most complex. IPUS handles all surface variables at the coarse resolution of 300 m, TSFA handles them at 30 m resolution, and TRFA handles them at 30 m or 300 m, depending on the actual spatial resolution of each variable. Analyzing and comparing the three methods gives a better understanding of spatial-scale errors in remote sensing of surface heat fluxes. In situ data collected during HiWATER-MUSOEXE (Multi-Scale Observation Experiment on Evapotranspiration over heterogeneous land surfaces of the Heihe Watershed Allied Telemetry Experimental Research) were used to validate and analyze the methods. ET estimated by TSFA exhibited the best agreement with in situ observations. The footprint validation results showed that the R², MBE, and RMSE values of the sensible heat flux (H) were 0.61, 0.90, and 50.99 W m⁻², respectively, and those for the latent heat flux (LE) were 0.82, −20.54, and 71.24 W m⁻², respectively. IPUS yielded the largest errors in ET estimation.
The RMSE of LE between the TSFA and IPUS methods was 51.30 W m⁻², and the RMSE of LE between the TSFA and TRFA methods was 16.48 W m⁻². Furthermore, additional analysis showed that the TSFA method can capture the subpixel variations of land surface temperature and the influences of various landscapes within mixed pixels.
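The validation statistics quoted above (R², MBE, and RMSE) can be computed along the following lines. This is a minimal sketch, not the study's own code. It assumes that MBE is the mean of estimated minus observed flux and that R² is the squared Pearson correlation coefficient, neither of which is defined in the abstract. The function name `validation_stats` and the sample values are hypothetical and are not taken from the HiWATER-MUSOEXE data.

```python
import numpy as np


def validation_stats(estimated, observed):
    """Return R^2, MBE, and RMSE for paired flux values (e.g. in W m^-2).

    Assumed conventions (not stated in the abstract):
      MBE  = mean(estimated - observed)
      RMSE = sqrt(mean((estimated - observed)^2))
      R^2  = squared Pearson correlation coefficient
    """
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    if est.shape != obs.shape:
        raise ValueError("estimated and observed must have the same shape")

    diff = est - obs
    mbe = float(diff.mean())
    rmse = float(np.sqrt((diff ** 2).mean()))
    r = float(np.corrcoef(est, obs)[0, 1])
    return {"r2": r ** 2, "mbe": mbe, "rmse": rmse}


# Illustrative placeholder values only; these are not measurements.
observed_le = [100.0, 200.0, 300.0, 400.0]
estimated_le = [110.0, 190.0, 310.0, 390.0]
stats = validation_stats(estimated_le, observed_le)
print(stats)
```

Under this sign convention, a negative MBE, such as the −20.54 W m⁻² reported for LE, would indicate that the estimates are lower than the observations on average. The same function applied to the outputs of two methods in place of `estimated` and `observed` gives a between-method RMSE of the kind reported for TSFA versus IPUS and TRFA.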