Foreknowledge about sugarcane crop size can help industry members make more informed decisions. There exists many different combinations of climate variables, seasonal climate prediction indices, and crop model outputs that could prove useful in explaining sugarcane crop size. A data mining method like random forests can cope with generating a prediction model when the search space of predictor variables is large. Research that has investigated the accuracy of random forests to explain annual variation in sugarcane productivity and the suitability of predictor variables generated from crop models coupled with observed climate and seasonal climate prediction indices is limited. Simulated biomass from the APSIM (Agricultural Production Systems sIMulator) sugarcane crop model, seasonal climate prediction indices and observed rainfall, maximum and minimum temperature, and radiation were supplied as inputs to a random forest classifier and a random forest regression model to explain annual variation in regional sugarcane yields at Tully, in northeastern Australia. Prediction models were generated on 1 September in the year before harvest, and then on 1 January and 1 March in the year of harvest, which typically runs from June to November. Our results indicated that in 86.36 % of years, it was possible to determine as early as September in the year before harvest if production would be above the median. This accuracy improved to 95.45 % by January in the year of harvest. The R-squared of the random forest regression model gradually improved from 66.76 to 79.21 % from September in the year before harvest through to March in the same year of harvest. All three sets of variables-(i) simulated biomass indices, (ii) observed climate, and (iii) seasonal climate prediction indices-were typically featured in the models at various stages. Better crop predictions allows farmers to improve their nitrogen management to meet the demands of the new crop, mill managers could better plan the mill's labor requirements and maintenance scheduling activities, and marketers can more confidently manage the forward sale and storage of the crop. Hence, accurate yield forecasts can improve industry sustainability by delivering better environmental and economic outcomes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.