In arid and semi-arid areas, soil moisture (SM) plays a crucial role in land-atmosphere interactions, hydrological processes, and ecosystem sustainability. Large-scale SM data are therefore critical for climatic, hydrological, and ecohydrological research. Fusing satellite products with model simulations is an important way to obtain SM data at large scales; however, few studies have compared data fusion methods from different categories. Here, we compared the performance of two widely used data fusion methods, the Ensemble Kalman Filter (EnKF) and the Back-Propagation Artificial Neural Network (BPANN), at a degraded grassland site (DGS) and an alpine grassland site (AGS). SM data from the Community Land Model 5.0 (CLM5.0) and the Soil Moisture Active Passive (SMAP) mission were fused and validated against observations from the Cosmic-Ray Neutron Sensor (CRNS) to avoid the impacts of scale mismatch. The results show that, compared with the original data sets at both sites, the RMSE of the data fused by BPANN (FD-BPANN) and by EnKF (FD-EnKF) was reduced by more than 50% and 31%, respectively. Overall, the FD-BPANN performed better than the FD-EnKF because the BPANN method assigned higher weights to the better-performing input data, whereas the EnKF method was affected by the strong variability of both the fused CLM5.0 and SMAP data and the CRNS data. However, in terms of percentile ranges, the FD-BPANN performed worst in the low SM range below the 25th percentile (<Q25), where it overestimated SM, because the BPANN method tends to become trapped in a local minimum. The BPANN method performed best in humid areas, followed by semi-humid areas, and then arid and semi-arid areas. Moreover, the BPANN method performed better in this study than in previous studies in arid and semi-arid areas.
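As a rough illustration of how an RMSE improvement percentage of this kind can be computed, the minimal sketch below assumes that improvement is defined as the relative reduction in RMSE against the reference observations. The abstract does not state this definition, so it is an assumption. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(estimates, observations):
    """Root-mean-square error between two equal-length sequences."""
    if len(estimates) != len(observations) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_improvement_pct(rmse_original, rmse_fused):
    """Relative RMSE reduction in percent (assumed definition of 'improvement')."""
    return 100.0 * (rmse_original - rmse_fused) / rmse_original


# Hypothetical volumetric SM values (m3/m3), for illustration only.
crns = [0.10, 0.12, 0.15, 0.11]      # reference observations
original = [0.14, 0.16, 0.19, 0.15]  # an unfused input product
fused = [0.12, 0.14, 0.17, 0.13]     # a fused product

improvement = rmse_improvement_pct(rmse(original, crns), rmse(fused, crns))
print(f"RMSE improvement: {improvement:.1f}%")
```

Under this definition, a positive value means the fused product is closer to the reference than the original product, and a negative value means fusion degraded the estimate.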