Estimating cropland evapotranspiration (ET) is essential for agricultural water management, drought monitoring, and yield forecasting. Remote sensing-based multi-source ET models have been widely applied and validated in the semi-arid region of China. However, their performance for different crop types (winter wheat and summer maize) in the semi-humid region still requires careful investigation. This study used remote sensing data (Landsat 8 and ASTER) to compare three mainstream multi-source ET models: (i) the two-source energy balance model (TSEB); (ii) the Penman-Monteith-based four-source model (4s-PM); and (iii) the Priestley-Taylor Jet Propulsion Laboratory ET algorithm (PT-JPL). Measurements from an eddy-covariance (EC) flux tower in Guantao County, North China, were used to validate the models. The TSEB model performed best in estimating the latent heat flux (LE) of maize, with an RMSE of 75.0 W/m² and an R² of 0.9, while the 4s-PM model gave the most accurate LE estimates for wheat, with an RMSE of 61.0 W/m² and an R² of 0.91. A comparison of the spatial distributions of LE indicated that the PT-JPL model better captured the heterogeneity of crop ET. The major environmental factors affecting ET varied with crop type and growth stage. Because they do not take soil moisture into account, the 4s-PM and TSEB models overestimated LE under water deficit in the maturation stage of wheat. The vegetation-index-based plant moisture stress term in the PT-JPL model led to underestimation of evaporation in the maturation stage, when the cropland was still wet.
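For reference, the two validation statistics quoted for LE can be written out as below. This is a sketch of the conventional definitions, not a statement of the exact formulas used in the study: the abstract does not say which form of R² was computed, so the squared Pearson correlation shown here is an assumption, and the symbols $LE_{\mathrm{mod},i}$ (modelled LE), $LE_{\mathrm{obs},i}$ (EC-observed LE), and $n$ (number of paired samples) are introduced only for illustration.

```latex
% Root-mean-square error between modelled and EC-observed latent heat flux (W/m^2)
\[
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}
  \left(LE_{\mathrm{mod},i} - LE_{\mathrm{obs},i}\right)^{2}}
\]

% Assumed form of R^2: squared Pearson correlation (dimensionless)
\[
R^{2} = \left[
  \frac{\sum_{i=1}^{n}\left(LE_{\mathrm{mod},i}-\overline{LE}_{\mathrm{mod}}\right)
                      \left(LE_{\mathrm{obs},i}-\overline{LE}_{\mathrm{obs}}\right)}
       {\sqrt{\sum_{i=1}^{n}\left(LE_{\mathrm{mod},i}-\overline{LE}_{\mathrm{mod}}\right)^{2}}\,
        \sqrt{\sum_{i=1}^{n}\left(LE_{\mathrm{obs},i}-\overline{LE}_{\mathrm{obs}}\right)^{2}}}
\right]^{2}
\]
```

Under these definitions, RMSE carries the units of LE and measures the typical size of the model error, whereas R² measures only how closely modelled and observed LE co-vary, so a model can score a high R² while still being biased.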