Abstract. Soil moisture (SM) datasets are critical to understanding
the global water, energy, and biogeochemical cycles and benefit extensive
societal applications. However, individual sources of SM data (e.g., in situ
and satellite observations, reanalysis, offline land surface model
simulations, Earth system model – ESM – simulations) have source-specific
limitations and biases related to spatiotemporal continuity,
resolution, and modeling and retrieval assumptions. Here, we developed seven
global, gap-free, long-term (1970–2016), multilayer (0–10, 10–30,
30–50, and 50–100 cm) SM products at monthly 0.5° resolution
(available at https://doi.org/10.6084/m9.figshare.13661312.v1; Wang and Mao, 2021) by
synthesizing a wide range of SM datasets using three statistical methods
(unweighted averaging, optimal linear combination, and emergent constraint).
The merged products outperformed their source datasets when evaluated with
in situ observations (mean bias from −0.044 to 0.033 m³ m⁻³, root
mean square error from 0.076 to 0.104 m³ m⁻³, Pearson
correlations from 0.35 to 0.67) and multiple gridded datasets that did not
enter merging because of insufficient spatial, temporal, or soil layer
coverage. Three of the new SM products, which were produced by applying each
of the three merging methods to the source datasets excluding the ESMs,
had lower bias and root mean square errors and higher correlations than the
ESM-dependent merged products. The ESM-independent products also showed a
better ability to capture historical large-scale drought events than the
ESM-dependent products. The merged products generally showed reasonable
temporal homogeneity and physically plausible global sensitivities to
observed meteorological factors, except that the ESM-dependent products
underestimated the low-frequency temporal variability in SM and
overestimated the high-frequency variability for the 50–100 cm depth.
Based on these evaluation results, we recommend the three ESM-independent
products for future applications because they perform better than
the ESM-dependent ones. Despite uncertainties in the raw
SM datasets and fusion methods, these hybrid products add value
over existing SM datasets through improved performance and
harmonized spatial, temporal, and vertical coverage, and they provide a new
foundation for scientific investigation and resource management.
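
As a rough illustration of the simplest of the three merging methods (unweighted averaging) and of the evaluation statistics quoted above (mean bias, root mean square error, Pearson correlation), the following Python sketch may be useful. It is not the authors' code: the function names `unweighted_merge` and `evaluation_metrics` and all numerical values are hypothetical, and the arrays are assumed to be already collocated on a common grid, time axis, and soil layer.

```python
import numpy as np


def unweighted_merge(sources):
    """Average several source SM arrays element-wise, skipping NaN gaps.

    `sources` is a sequence of equally shaped arrays (hypothetical inputs,
    assumed to share grid, time axis, and soil layer).
    """
    stacked = np.stack([np.asarray(s, dtype=float) for s in sources])
    return np.nanmean(stacked, axis=0)


def evaluation_metrics(estimate, observed):
    """Return mean bias, RMSE, and Pearson correlation over valid pairs."""
    estimate = np.asarray(estimate, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Keep only positions where both series have data.
    valid = ~np.isnan(estimate) & ~np.isnan(observed)
    diff = estimate[valid] - observed[valid]
    return {
        "bias": diff.mean(),
        "rmse": np.sqrt((diff ** 2).mean()),
        "corr": np.corrcoef(estimate[valid], observed[valid])[0, 1],
    }


# Made-up volumetric SM values (m3 m-3) for two sources and one in situ site.
source_a = [0.20, 0.30, np.nan, 0.25]
source_b = [0.30, 0.30, 0.20, 0.35]
in_situ = [0.22, 0.33, 0.18, 0.28]

merged = unweighted_merge([source_a, source_b])
print(evaluation_metrics(merged, in_situ))
```

The other two methods named in the abstract (optimal linear combination and emergent constraint) need weights or regression relationships estimated from data and are not covered by this sketch.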