Field-scale surface soil moisture (SSM, 0-10 cm), which is closely linked with land surface temperature (LST), is particularly important to agricultural water resource management. Active and passive microwave remote sensing-based SSM retrievals, at resolutions on the order of square kilometers, are difficult to apply to heterogeneous agricultural land surfaces that may need SSM data at a resolution of 30 m. In this study, the High-resolution Urban Thermal Sharpener and the Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model were applied to downscale optical and thermal remote sensing data simultaneously by blending Landsat and MODIS red-near infrared-LST data, with the ultimate goal of generating field-scale SSM values using the trapezoidal approach. To evaluate the performance of the downscaled LST_E (based on the Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model method) and SSM, an irrigation district in Inner Mongolia (Area 1) and an irrigation district in the North China Plain (Area 2), with varying spatial heterogeneity, were selected as testbeds. Results indicated that the downscaled LST_E was highly consistent with synchronous Landsat LST_H and in situ LST measurements in Area 1, with the root-mean-square error ranging from 0.73 to 2.75 K. Compared with the MODIS SSM, the downscaled SSM reduced the average root-mean-square error from 0.048 to 0.038 cm³/cm³ for both areas.
The downscaled LST_E and SSM developed in this study enhance the spatiotemporal resolutions of the SSM estimates, maximizing the potential of remotely sensed information for agricultural water resource management.

Plain Language Summary
Field-scale (30 m) surface soil moisture (SSM), closely linked with land surface temperature (LST), is particularly important for agricultural water resource management, such as the assessment of agricultural droughts, the optimization of irrigation schedules, and the improvement of water use efficiency, particularly over heterogeneous agricultural land. Here, the High-resolution Urban Thermal Sharpener (HUTS) and the Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM) were jointly adopted to downscale optical and thermal remote sensing data simultaneously by blending Landsat and MODIS red-near infrared-LST data, with the ultimate goal of generating field-scale SSM values from the downscaled red-near infrared-LST data using the theoretical trapezoidal approach. The field-scale LST and SSM developed in this study maximize the potential of remotely sensed information, improve both the spatial and temporal resolutions of SSM, and provide more valuable information on heterogeneous land surfaces for agricultural water resource management.

Key Points:
• Downscaled LST_E (30 m), closely linked with SSM, can be acquired by jointly using the HUTS and ESTARFM methods
• Downscaled field-scale SSM (30 m) represents the actual soil moisture better than MODIS SSM over heterogeneous land
• Downscaled red-near infrared-LST_E data improve both the spatial and temporal resolution of SSM
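
To make the trapezoidal idea referred to above more concrete, the following is a minimal, generic sketch of how a relative soil moisture index can be read off an LST-vegetation trapezoid. It is not the study's parameterization: the use of NDVI as the vegetation axis, the linear dry and wet edges, the function names, and every numeric vertex value (`v_soil`, `v_veg`, and the four edge temperatures) are hypothetical placeholders chosen only for illustration.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index from red and NIR reflectance."""
    return (nir - red) / (nir + red)


def fractional_cover(v, v_soil=0.2, v_veg=0.86):
    """Scale a vegetation index to fractional cover in [0, 1].

    v_soil and v_veg are assumed bare-soil and full-canopy index values.
    """
    fc = (v - v_soil) / (v_veg - v_soil)
    return min(1.0, max(0.0, fc))


def moisture_index(lst, fc,
                   dry_soil=325.0, dry_veg=305.0,
                   wet_soil=295.0, wet_veg=293.0):
    """Relative moisture index (0 = dry edge, 1 = wet edge) for one pixel.

    The four vertex temperatures (in K) are illustrative assumptions; the
    dry and wet edges are interpolated linearly in fractional cover.
    """
    t_dry = dry_soil + fc * (dry_veg - dry_soil)  # warm (dry) edge at this cover
    t_wet = wet_soil + fc * (wet_veg - wet_soil)  # cold (wet) edge at this cover
    idx = (t_dry - lst) / (t_dry - t_wet)
    return min(1.0, max(0.0, idx))


# Example pixel: downscaled red/NIR reflectance and LST (values are made up).
fc = fractional_cover(ndvi(red=0.10, nir=0.30))
print(moisture_index(lst=308.0, fc=fc))
```

In this sketch a pixel's position between the dry and wet edges at its vegetation cover gives a dimensionless index; converting that index to volumetric SSM in cm³/cm³ would need additional soil information that is not assumed here.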