Abstract. Rice is the most important staple food in Asia. However,
high-spatiotemporal-resolution rice yield datasets are limited over this
large region. The lack of such products greatly hinders studies that are
aimed at accurately assessing the impacts of climate change and simulating
agricultural production. Based on annual rice maps in Asia, we incorporated
multisource predictors into three machine learning (ML) models to generate
a high-spatial-resolution (4 km) seasonal rice yield dataset
(AsiaRiceYield4km) for the 1995–2015 period. Predictors were grouped into four
categories chosen to represent rice growth conditions as comprehensively as possible, and
the optimal ML model was selected using an inverse probability weighting method. The results showed that AsiaRiceYield4km achieves good accuracy for
seasonal rice yield estimation (single rice: R² = 0.88, RMSE = 920 kg ha⁻¹; double rice: R² = 0.91, RMSE = 554 kg ha⁻¹; and triple rice:
R² = 0.93, RMSE = 588 kg ha⁻¹). Compared with the single-rice yields from the Spatial
Production Allocation Model (SPAM), AsiaRiceYield4km improved the R²
by 0.20 and reduced the RMSE by 618 kg ha⁻¹ on average. In particular,
constant environmental conditions, including longitude, latitude, elevation
and soil properties, contributed the most (∼ 45 %) to rice
yield estimation. For different rice growth periods, we found that the
predictors of the reproductive period had greater impacts on rice yield
prediction than those of the vegetative period and the whole growing period.
AsiaRiceYield4km is a novel long-term gridded rice yield dataset that
fills the gap in high-spatial-resolution seasonal yield products
across major rice production areas and can support further studies on
agricultural sustainability worldwide. AsiaRiceYield4km can be downloaded
from the following open-access data repository:
https://doi.org/10.5281/zenodo.6901968 (Wu et al., 2022).
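
The accuracy figures above are reported as R² and RMSE. As a minimal sketch of how such scores can be computed from paired observed and estimated yields, the Python below assumes that R² denotes the coefficient of determination (1 − SS_res / SS_tot); the abstract does not state which definition was used, and a squared Pearson correlation would give different values. The function names and the sample yields are hypothetical and are not taken from AsiaRiceYield4km.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (e.g. kg/ha)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up yields in kg/ha, for illustration only (not from the dataset).
observed = [5000.0, 6000.0, 7000.0, 8000.0]
predicted = [5200.0, 5900.0, 7100.0, 7800.0]

print(f"RMSE = {rmse(observed, predicted):.1f} kg/ha")
print(f"R2   = {r_squared(observed, predicted):.3f}")
```

Because RMSE keeps the units of the yield data, it can be compared directly across products, whereas R² is unitless and depends on the variance of the observed yields in each rice season.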