Spatially distributed high-resolution data of land surface temperature (LST) and evapotranspiration (ET) are important for crop water management and other applications in the agricultural sector. While satellite data can provide LST at spatial resolutions of 100 m, the current development of unmanned aerial systems (UAS) and affordable low-weight thermal cameras allows LST and subsequent ET to be derived at resolutions down to the centimetre scale. In this study, UAS-based images in the thermal infrared (TIR) and visible spectral range were collected over a managed temperate grassland in July 2016 at the Terrestrial Environmental Observatories (TERENO) preAlpine observatory site at Fendt, Germany. The UAS set-up included a lightweight thermal camera (Optris Pi Lightweight) and a regular digital camera (Sony α 6000), which allowed thermal and optical images to be acquired with ground resolutions of 5 cm and 1 cm, respectively. Three TIR-based ET models of different complexity were applied, and the resulting ET estimates were compared to eddy covariance (EC) observations of turbulent energy fluxes and to the evaporative fraction. While the Deriving Atmosphere Turbulent Transport Useful To Dummies Using Temperature (DATTUTDUT) model and the Triangle Method belong to the group of simpler contextual models, the Two-Source Energy Balance (TSEB) model incorporates a more physically based formulation of the surface energy balance. In addition to the comparison of UAS-based estimates of latent heat fluxes to EC observations, the effect of the spatial resolution of the model input imagery on the modelled results was analysed by running the models with imagery ranging from the native resolution of the acquired images to resolutions aggregated up to 30 m. The results show that both contextual models are sensitive to the input image resolution and that the agreement with the EC observations improves with increasing image resolution.
The TSEB model assumes that LST pixels represent a mixed signal of the soil and canopy components, so an image resolution coarse enough to satisfy this assumption should be chosen. However, with the exception of the native image resolution of 5 cm, we found no effect of image resolution on the spatially weighted mean TSEB estimates. For the studied grassland, the comparison of model estimates with EC observations indicates that all three models are able to reproduce the observed energy fluxes with comparable accuracy, with mean absolute errors of ET between 20 and 40 W m−2. The TSEB model showed larger deviations from the reference observations under cloudy conditions, when LST fluctuated rapidly within the 30 min averaging period for EC. The two contextual models yielded similar results for most of the flights. The good performance of the DATTUTDUT model, which had the lowest input requirements of the three models, is especially promising in view of the application of UAS for routine near-real-time ET monitoring.
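
The resolution experiment and the comparison against EC observations described above can be illustrated with a minimal sketch. It assumes that aggregation is a plain arithmetic block mean of LST and that model skill is summarised as the mean absolute error of latent heat flux; the abstract does not state how the imagery was aggregated, so this is not necessarily the procedure used in the study. The function names and all numerical values are hypothetical and serve only as illustration. The evaporative fraction is computed with its standard definition, LE / (H + LE).

```python
def aggregate_mean(image, factor):
    """Block-average a 2-D grid (list of equal-length rows) by an integer factor.

    Assumption: simple arithmetic mean of LST values per block. This is an
    illustrative choice, not a method taken from the study.
    """
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be divisible by factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [image[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def mean_absolute_error(modelled, observed):
    """Mean absolute difference between paired modelled and observed fluxes."""
    if len(modelled) != len(observed) or not modelled:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(m - o) for m, o in zip(modelled, observed)) / len(modelled)


def evaporative_fraction(latent_heat, sensible_heat):
    """Evaporative fraction: share of turbulent energy used for evaporation."""
    return latent_heat / (latent_heat + sensible_heat)


# Hypothetical 4 x 4 LST grid in kelvin, aggregated to 2 x 2.
lst_native = [
    [300.0, 302.0, 310.0, 312.0],
    [304.0, 306.0, 314.0, 316.0],
    [290.0, 292.0, 298.0, 300.0],
    [294.0, 296.0, 302.0, 304.0],
]
lst_coarse = aggregate_mean(lst_native, 2)

# Hypothetical latent heat fluxes in W m-2 for four flights.
le_model = [250.0, 310.0, 180.0, 275.0]
le_ec = [230.0, 340.0, 200.0, 265.0]
mae = mean_absolute_error(le_model, le_ec)

# Hypothetical latent and sensible heat fluxes in W m-2.
ef = evaporative_fraction(300.0, 100.0)
```

Running each model on `lst_native` and on successively coarser grids, then comparing the resulting error statistics, would mirror the kind of sensitivity test to input resolution that the abstract describes.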