Evapotranspiration (ET) is a key indicator of ecosystem processes and hydrological cycles in arid and semi-arid regions. Studying the characteristics and drivers of ET at the basin scale can improve our understanding of regional water balance and energy cycles. This study used MODIS (MOD16A2) data on the Pixel Information Expert Engine platform to extract the spatial and temporal characteristics of interannual and seasonal ET in the Urumqi River Basin in Xinjiang, China, over a 20-year period, from 2000 to 2020, and to analyze the influence of land use and altitude on ET in the basin. Mean annual ET in the basin showed an increasing trend over the past two decades, varying from 126.57 mm to 247.66 mm, with monthly ET peaking in July and reaching its minimum in December. On the seasonal scale, the ET trend is greatest in summer, followed by spring, and smallest in winter. Spatially, surface ET in the Urumqi River Basin is generally high in the upstream area and low in the downstream area, with the multi-year average ET across the basin ranging from 22.74 to 479.33 mm. By land-use type, forest land had the highest average ET and unused land the lowest. Altitude had a pronounced effect on ET, which increased significantly with altitude. Analysis of the drivers of ET change from 2000 to 2020 using the Optimal Parameters-based Geographical Detector (OPGD) model showed that the most influential natural factors were, in descending order, temperature > vegetation cover > precipitation. Among the interacting factors, the pairings of vegetation index with temperature, elevation, and precipitation, and of land use with elevation, had a relatively greater influence on ET in the basin, and every interaction had a greater effect than either of its single factors alone.
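
The factor ranking and interaction results above rest on the q-statistic at the core of the geographical detector, q = 1 - (sum over strata of N_h * var_h) / (N * var), which measures how much of the spatial variance in ET is explained by a stratified factor. The following is a minimal sketch of that statistic only, not the OPGD implementation used in the study: it assumes each factor has already been discretized into strata and omits the optimal-parameter search over discretization methods and break numbers. The function names `q_statistic` and `interaction_q` are hypothetical, chosen for illustration.

```python
import numpy as np


def q_statistic(y, strata):
    """Factor-detector q: share of variance in y explained by the strata.

    y      : response values (e.g. ET per pixel or sample point)
    strata : class label of the discretized factor for each value in y
    """
    y = np.asarray(y, dtype=float)
    strata = np.asarray(strata)
    if y.shape != strata.shape:
        raise ValueError("y and strata must have the same shape")

    # N * sigma^2 for the whole study area (population variance, ddof=0)
    total_ss = y.size * y.var()
    if total_ss == 0:
        raise ValueError("y has zero variance; q is undefined")

    # Sum of N_h * sigma_h^2 over every stratum h
    within_ss = 0.0
    for label in np.unique(strata):
        group = y[strata == label]
        within_ss += group.size * group.var()

    return 1.0 - within_ss / total_ss


def interaction_q(y, strata_a, strata_b):
    """Interaction-detector q: q of the overlay of two stratified factors."""
    # Assumption: the overlay is formed by pairing the two class labels
    combined = np.array([f"{a}|{b}" for a, b in zip(strata_a, strata_b)])
    return q_statistic(y, combined)
```

Under these assumptions, a statement such as "the effects of interacting factors were all greater than those of single factors" corresponds to `interaction_q(y, a, b)` exceeding both `q_statistic(y, a)` and `q_statistic(y, b)` for each factor pair.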