There has been a recent surge in the number of studies that aim to model crop yield using data-driven approaches. This has largely come about due to the increasing amounts of remote sensing (e.g. satellite imagery) and precision agriculture data available (e.g. high-resolution crop yield monitor data), as well as the abundance of machine learning modelling approaches. However, there are several common issues in published studies in the field of precision agriculture (PA) that must be addressed. These include the terminology used in relation to crop yield modelling, prediction, forecasting, and interpolation, as well as the way that models are calibrated and validated. As a typical example, many studies will take a crop yield map, or several plots within a field, from a single season, build a model with satellite or Unmanned Aerial Vehicle (UAV) imagery, validate it using data-splitting or some form of cross-validation (e.g. k-fold), and describe the result as a ‘prediction’ or ‘forecast’ of crop yield. This is problematic because the approach does not test the forecasting ability of the model: the model is calibrated and validated on data from the same season, which substantially overestimates its value for in-season decision-making, such as an application of fertiliser. This is an all-too-common logical flaw in published studies. Moving forward, it is essential that clear definitions and guidelines for data-driven yield modelling and validation are outlined so that there is a greater connection between the goal of a study and its actual outputs and outcomes. To demonstrate this, the current study uses a case study dataset from a collection of large neighbouring farms in New South Wales, Australia. The dataset includes 160 yield maps of winter wheat (Triticum aestivum) covering 26,400 hectares over a 10-year period (2014–2023).
Machine learning crop yield models are built at 30 m spatial resolution with a suite of predictor data layers that relate to crop yield, including datasets that represent soil variation, terrain, weather, and satellite imagery of the crop. Predictions are made at both within-field (30 m) and field resolution. Crop yield predictions are useful for an array of applications, so four experiments were set up to reflect different scenarios: Experiment 1, forecasting yield mid-season (e.g. for mid-season fertilisation); Experiment 2, forecasting yield late-season (e.g. for late-season logistics or forward selling); Experiment 3, predicting yield in a previous season for a field with no yield data in that season; and Experiment 4, predicting yield in a previous season for a field with partial yield data (e.g. two combine harvesters, but only one fitted with a yield monitor). This study showcases how different model calibration and validation approaches clearly impact prediction quality, and therefore how they should be interpreted in data-driven crop yield modelling studies. This is key for ensuring that the wealth of data-driven crop yield modelling studies not only contributes to the science, but also delivers actual value to growers, industry, and governments.
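The distinction the abstract draws between within-season validation and true forecasting can be illustrated with a minimal, hedged sketch. All data and variable names below are synthetic and illustrative (not the study's actual dataset or code): random k-fold lets every test fold share seasons with the training data, whereas grouping folds by season holds out whole years, which is the situation a mid-season forecast actually faces.

```python
# Illustrative sketch with synthetic data: random k-fold cross-validation
# mixes seasons between train and test folds (within-season interpolation),
# while leave-one-season-out keeps held-out seasons unseen (forecasting).
import numpy as np
from sklearn.model_selection import KFold, GroupKFold

rng = np.random.default_rng(0)
n_obs = 200
seasons = rng.integers(2014, 2024, size=n_obs)  # 10 seasons, as in the case study
X = rng.normal(size=(n_obs, 5))                 # stand-in predictor layers
y = rng.normal(size=n_obs)                      # stand-in yield values

# Random k-fold: test folds share seasons with the training data,
# so the score reflects within-season interpolation, not forecasting.
kfold_leaks = any(
    set(seasons[tr]) & set(seasons[te])
    for tr, te in KFold(n_splits=5, shuffle=True, random_state=0).split(X)
)

# Leave-one-season-out: every held-out season is absent from training,
# mimicking a genuine forecast of an unseen year.
season_holdout_clean = all(
    set(seasons[tr]).isdisjoint(seasons[te])
    for tr, te in GroupKFold(n_splits=10).split(X, y, groups=seasons)
)
```

With this setup, `kfold_leaks` is true and `season_holdout_clean` is true, which is the leakage contrast the abstract argues makes random data-splitting unsuitable for claiming forecasting skill.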
Senescence is a highly ordered biological process involving resource redistribution away from aging tissues that affects yield and quality in annuals and perennials. Images from 14 unmanned / unoccupied / uncrewed aerial system (UAS, UAV, drone) flights captured the senescence window across two experiments while functional principal component analysis (FPCA) effectively reduced the dimensionality of temporal visual senescence ratings (VSRs) and two vegetation indices: the red chromatic coordinate index (RCC) and transformed normalized difference green and red index (TNDGR). Convolutional neural networks (CNNs) trained on temporally concatenated, or “sandwiched,” UAS images of individual cotton plants (Gossypium hirsutum L.), allowed single-plant analysis (SPA). The first functional principal component scores (FPC1) served as the regression target across six CNN models (M1-M6). Model performance was strongest for FPC1 scores from VSRs (R2 = 0.857 and 0.886 for M1 and M4), strong for TNDGR (R2 = 0.743 and 0.745 for M3 and M6), and strong-to-moderate for RCC (R2 = 0.619 and 0.435 for M2 and M5), with deep learning attention of each model confirmed by activation of plant pixels within saliency maps. Single-plant UAS image analysis across time enabled translatable implementations of high-throughput phenotyping by linking deep learning with functional data analysis (FDA). This has applications for fundamental plant biology, monitoring orchards or other spaced plantings, plant breeding, and genetic research.
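The dimensionality-reduction step described above can be sketched in a minimal, hedged form. On a regular grid of flight dates, functional PCA of discretised trajectories reduces to ordinary PCA of the sampled curves; the synthetic senescence curves and variable names below are illustrative assumptions, not the study's data or pipeline.

```python
# Minimal sketch with synthetic curves: FPCA of per-plant senescence
# trajectories sampled at 14 flights, approximated by ordinary PCA on the
# discretised curves. FPC1 scores are the scalar regression target the
# abstract describes feeding to the CNN models.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n_plants, n_flights = 120, 14          # 14 UAS flights per plant
t = np.linspace(0.0, 1.0, n_flights)

# Synthetic trajectories: a logistic decline whose onset varies by plant,
# mimicking earlier vs. later senescence, plus small measurement noise.
onset = rng.normal(0.5, 0.1, size=n_plants)[:, None]
curves = 1.0 / (1.0 + np.exp(20.0 * (t[None, :] - onset)))
curves += rng.normal(scale=0.02, size=curves.shape)

# FPC1 captures the dominant mode of temporal variation (here, timing of
# decline); each plant's FPC1 score summarises its whole trajectory.
fpca = PCA(n_components=2).fit(curves)
fpc1_scores = fpca.transform(curves)[:, 0]
explained = fpca.explained_variance_ratio_[0]
```

Because senescence timing dominates the variation in these synthetic curves, the first component explains most of the variance, which is why a single FPC1 score per plant can stand in for the full time series as a regression target.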