2017
DOI: 10.3390/app7070708

Potential Model Overfitting in Predicting Soil Carbon Content by Visible and Near-Infrared Spectroscopy

Abstract: Soil spectroscopy is known as a rapid and cost-effective method for predicting soil properties from spectral data. The objective of this work was to build a statistical model to predict soil carbon content from spectral data by partial least squares regression using a limited number of soil samples. Soil samples were collected from two soil orders (Andisol and Ultisol), where the dominant land cover is native Nothofagus forest. Total carbon was analyzed in the laboratory and samples were scanned using a spectr…

Cited by 10 publications (4 citation statements)
References 45 publications
“…The hyper-parameter optimisation and calibration of the model was done through leave-one-out cross-validation (LOOCV) [39]. For the calibration dataset of n = 74 samples, LOOCV means that n − 1 samples are used to calibrate the model and 1 sample is used to assess the accuracy; this is repeated n times for each single sample in the calibration dataset [40].…”
Section: Methods
Citation type: mentioning
confidence: 99%
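
The leave-one-out procedure described in this statement can be sketched as follows. This is a minimal illustration, not the cited authors' code: the helper names (`fit_line`, `loocv_rmse`) and the toy data are hypothetical, and a single-predictor least-squares line stands in for the partial least squares regression model so that the example needs only the standard library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Assumption: a simple stand-in for the PLSR calibration model.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def loocv_rmse(xs, ys):
    """Leave-one-out CV: calibrate on n - 1 samples, test on the held-out one.

    The loop runs n times, so every sample is held out exactly once.
    Returns the root mean squared error of the n held-out predictions.
    """
    n = len(xs)
    squared_errors = []
    for i in range(n):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        prediction = intercept + slope * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return (sum(squared_errors) / n) ** 0.5


# Hypothetical toy data: one spectral feature and a carbon content per sample.
absorbance = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
carbon = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]
rmse = loocv_rmse(absorbance, carbon)
```

With n = 74 samples, the same loop would fit the model 74 times, each time on 73 samples.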
“…Diffuse reflectance spectra are particularly sensitive to soil particle size, shape and aggregation, with the result that prediction accuracy of soil properties is very sensitive to soil particle fineness, sample surface roughness and soil moisture (Chang et al, 2001; Liu et al, 2020; Manage et al, 2018; Nduwamungu et al, 2009; Reeves, 2010; Sun, 2021; Tekin et al, 2012). Not surprisingly then, soil properties such as texture, C and N content, CEC and pH have been estimated by NIR and VNIR with widely varying degrees of success (Gates, 2018; Nduwamungu et al, 2009; Reyna et al, 2017; Viscarra Rossel et al, 2006). Critical variables contributing to large errors in the calibration of the method are the geographic range of soils included in the calibration step, sample preparation (including sieving, grinding and drying) and the spectral preprocessing and calibration procedures used.…”
Section: Basic Features of NIR, VNIR and MIR Reflectance Spectra
Citation type: mentioning
confidence: 99%
“…Wadoux et al (2020) further demonstrated this concept by introducing meaningless pseudo-covariates into a model, thereby enabling the accurate prediction of soil organic carbon, and concluded that pattern recognition models based on machine learning are not reliable ways to obtain knowledge about soil properties. In another example, Reyna et al (2017) concluded from their modelling of the VNIR reflectance spectra to estimate the organic carbon content of the soil that the predictive model was not robust and was in fact overfitting and misinterpreting the data. Thus, the spectral data could only provide descriptive, not quantitative, information about soil organic carbon.…”
Section: 'Black Box' Model Predictions and Overfitting of Data
Citation type: mentioning
confidence: 99%
“…Although there is no formal rule for the choice of k, 10-fold or 5-fold cross validation is commonly used. Leave-one-out cross validation is a special form of k-fold cross validation in which each iteration only has one sample, and this method is typically used in studies with limited soil samples (e.g., <100 samples, Chang and Laird, 2002; Du et al, 2009; Wang et al, 2014; Reyna et al, 2017; Amin et al, 2020). Currently, there is no consensus on which validation strategy is more suitable and robust in evaluating the performance of spectroscopic models at different scales.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
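
The relationship between k-fold and leave-one-out cross validation mentioned in this statement can be made concrete with a small index splitter. This is a hedged sketch under stated assumptions: `k_fold_indices` is a hypothetical helper, the folds are contiguous and unshuffled for readability (real analyses would normally shuffle the samples first), and the sample count of 10 is arbitrary.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size.

    Yields one (train_indices, test_indices) pair per fold.
    Assumption: no shuffling, so folds follow the original sample order.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# 5-fold CV on 10 samples: each test fold holds 2 samples.
five_fold = list(k_fold_indices(10, 5))

# Setting k equal to the number of samples gives leave-one-out CV:
# every test fold holds exactly one sample.
leave_one_out = list(k_fold_indices(10, 10))
```

Setting k to the sample count reproduces the leave-one-out scheme, which is why small datasets (for example, fewer than 100 samples) can afford it: the number of model fits grows only linearly with n.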