Recent advances in high-throughput field phenotyping have given plant breeders affordable and efficient tools for evaluating large numbers of genotypes for important agronomic traits at early growth stages. Nevertheless, using the large datasets generated by high-throughput phenotyping tools such as hyperspectral reflectance in cultivar development programs remains challenging because it requires extensive knowledge of computational and statistical analysis. In this study, the robustness of three common machine learning (ML) algorithms, multilayer perceptron (MLP), support vector machine (SVM), and random forest (RF), was evaluated for predicting soybean (Glycine max) seed yield from hyperspectral reflectance. To this end, hyperspectral reflectance data covering the full spectrum from 395 to 1005 nm were collected at the R4 and R5 growth stages on 250 soybean genotypes grown in four environments. Recursive feature elimination (RFE) was used to reduce the dimensionality of the hyperspectral reflectance data and to select the variables with the largest importance values. The results indicated that R5 is the more informative stage for measuring hyperspectral reflectance to predict seed yield. The 395 nm reflectance band was also identified as a highly ranked band for predicting soybean seed yield. Using either the full or the selected variables as inputs, the ML algorithms were evaluated individually and in combination, through the ensemble–stacking (E–S) method, for predicting soybean yield. Among the individual algorithms tested, RF performed best, with a yield classification accuracy of 84%. Therefore, with RF selected as the metaClassifier for the E–S method, prediction accuracy increased to 93% using all variables and 87% using the selected variables, showing the success of E–S as an ensemble technique. This study demonstrated that soybean breeders could implement the E–S algorithm, using either the full or the selected reflectance spectra, to select high-yielding soybean genotypes from among a large number of genotypes at early growth stages.
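The workflow described above (RFE for band selection, then MLP, SVM, and RF stacked under an RF meta-learner) might be sketched as follows. This is only an illustration under stated assumptions: it uses scikit-learn, which the abstract does not name as the authors' toolchain, and it runs on synthetic data standing in for reflectance bands and yield classes. The sample size, feature counts, and hyperparameters are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in (assumption): columns play the role of reflectance bands,
# labels play the role of yield classes. Not the study's data.
X, y = make_classification(
    n_samples=250, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Recursive feature elimination: repeatedly fit an RF and drop the
# least important features until 10 remain (10 is an arbitrary choice).
selector = RFE(
    RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=10,
    step=5,
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Base learners: MLP and SVM are scaled first; RF does not need scaling.
base_learners = [
    ("mlp", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    )),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Stacking: cross-validated predictions of the base learners become the
# inputs of the meta-learner, here a random forest.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    cv=3,
)
stack.fit(X_train_sel, y_train)

n_selected = int(selector.support_.sum())
accuracy = stack.score(X_test_sel, y_test)
print(f"selected features: {n_selected}, held-out accuracy: {accuracy:.2f}")
```

To compare the full-variable and selected-variable settings mentioned in the abstract, the same `stack` could be fitted on `X_train` instead of `X_train_sel`. Accuracy on synthetic data says nothing about the figures reported in the study.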
Prominent yellow flowers in a Brassica oilseed crop such as canola require careful consideration when selecting a spectral index for yield estimation. This study evaluated spectral indices for multispectral sensors that correlate with the seed yield of Brassica oilseed crops. A small-plot experiment was conducted near Pendleton, Oregon, in which spring canola was grown under varying water regimes and nitrogen treatments to create a wide range of oilseed yields. Plot measurements consisted of canopy reflectance at flowering, taken with a hand-held spectroradiometer, and seed yield at physiological maturity. Spectroradiometric measurements were converted to MODIS band-equivalent reflectance. Selected indices were computed from spectra obtained with the radiometer and correlated with seed yield. A normalized difference yellowness index (NDYI), computed from the green and blue wavebands, overcame limitations of the normalized difference vegetation index (NDVI) during flowering and best modeled variability in relative yield potential. NDYI was more linear and more strongly correlated with county-wide oilseed yield data and MODIS satellite data from North Dakota (r² ≤ 0.72) than NDVI (r² ≤ 0.66). NDYI requires only wavebands in the visible region of the spectrum and can be applied to any satellite or aerial sensor that has blue and green channels. These findings highlight the benefit of using a spectral index that is sensitive to reproductive rather than vegetative growth for crops with spectrally prominent reproductive canopy elements. Our results indicate that NDYI is a better indicator of yield potential than NDVI during mid-season development stages, especially peak flowering.
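As a minimal sketch of the two indices compared above: both are normalized differences of two bands, with NDYI taken as (green − blue) / (green + blue) and NDVI as (NIR − red) / (NIR + red). The function name and the reflectance values below are hypothetical, chosen only to show the arithmetic. They are not measurements from the study.

```python
import numpy as np

def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b) element-wise for two reflectance arrays."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    return (a - b) / (a + b)

# Hypothetical band reflectances for two plots (illustrative values only).
blue = np.array([0.04, 0.05])
green = np.array([0.12, 0.10])
red = np.array([0.10, 0.08])
nir = np.array([0.50, 0.40])

ndyi = normalized_difference(green, blue)  # yellowness: green vs. blue
ndvi = normalized_difference(nir, red)     # greenness: NIR vs. red

print("NDYI:", ndyi)
print("NDVI:", ndvi)
```

Because NDYI uses only the green and blue bands, the same function applies to any sensor that provides those two channels, as the abstract notes.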