Determining evapotranspiration (ET) and its components in urban woodlands is crucial for mitigating the urban heat island effect and supporting sustainable urban development. However, accurately estimating ET in urban areas is challenging because of the heterogeneity of the underlying surface and the impact of human activities. In this study, we compared the performance of three classic types of two-source ET models on urban woodlands in Shenzhen, China: a purely physical, process-based model (the Shuttleworth–Wallace model), a semi-empirical, physical process-based model (the FAO-56 dual-Kc model), and a purely statistical model (a deep neural network). The three models were validated against eddy correlation measurements and stable hydrogen and oxygen isotope observations. The validation results suggested that the Shuttleworth–Wallace model performed best in the ET simulation at the main urban area site (coefficient of determination (R²) of 0.75), whereas the FAO-56 dual-Kc model performed best at the suburban site (R² of 0.77). The deep neural network better captured the nonlinear relationship between ET and the environmental variables and achieved the best simulation performance across both the main urban and suburban sites (R² of 0.73 for the main urban and suburban sites). A correlation analysis showed that simulated urban ET is most sensitive to temperature and least sensitive to wind speed. This study further analyzed the causes of the varying performance of the three classic ET models in terms of their model mechanisms. The results are relevant to urban cooling and sustainable urban development.
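To make the two-source idea and the validation metric concrete, the following is a minimal sketch, not the implementation used in this study. It assumes the standard FAO-56 dual crop coefficient form, ET = (Ks·Kcb + Ke)·ET0, which splits ET into transpiration and soil evaporation, and it assumes that R² is computed as the squared Pearson correlation between observed and simulated ET (the abstract does not state which definition was used). The function names `dual_kc_et` and `r_squared` and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def dual_kc_et(et0, kcb, ke, ks=1.0):
    """Partition ET with the FAO-56 dual crop coefficient form.

    ET = (Ks * Kcb + Ke) * ET0, where ET0 is reference ET, Kcb the basal
    crop coefficient, Ke the soil evaporation coefficient, and Ks the
    water stress coefficient (1.0 means no stress).
    Returns (transpiration, evaporation, total ET) in the units of et0.
    """
    et0 = np.asarray(et0, dtype=float)
    transpiration = ks * kcb * et0
    evaporation = ke * et0
    return transpiration, evaporation, transpiration + evaporation


def r_squared(observed, simulated):
    """Coefficient of determination as the squared Pearson correlation.

    Assumption: other definitions (e.g. 1 - SSE/SST) exist and can give
    different values for the same data.
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    r = np.corrcoef(observed, simulated)[0, 1]
    return r ** 2


if __name__ == "__main__":
    # Illustrative values only (mm/day); not data from the study.
    et0 = [3.8, 4.5, 5.1, 4.0, 2.9]
    observed_et = [3.9, 4.6, 5.8, 4.1, 3.3]

    t, e, et = dual_kc_et(et0, kcb=0.9, ke=0.15)
    print("transpiration:", t)
    print("evaporation:  ", e)
    print("R^2 vs. observations:", r_squared(observed_et, et))
```

In such a sketch, the same `r_squared` call could score any of the three model types against the eddy correlation record, while the transpiration and evaporation components would be compared with the isotope-based partitioning.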