The Effect of Bioclimatic Covariates on Ensemble Machine Learning Prediction of Total Soil Carbon in the Pannonian Biogeoregion

Radočaj, Dorijan; Jurišić, Mladen; Tadić, Vjekoslav

doi:10.3390/agronomy13102516

Cited by 3 publications

(2 citation statements)

References 49 publications


“…However, while several broader studies generally agreed on the effectiveness of ensemble machine learning, they also provided mixed observations regarding its robustness relative to individual methods. These studies noted the dependence of their prediction accuracy on the characteristics of the input samples [25,47] and the prediction principles used by the individual methods in the ensemble [26]. XGB was shown to be a superior prediction method to RF and SVM, demonstrating robustness and resistance to overfitting as shown by the comprehensive leave-one-out cross-validation approach [48].…”

Section: Results (mentioning)

confidence: 99%

Influence of Thermal Pretreatment on Lignin Destabilization in Harvest Residues: An Ensemble Machine Learning Approach

Kovačić, Radočaj, Samac et al. 2024. AgriEngineering (self-citation)

Research on lignocellulose pretreatments is generally performed through experiments that require substantial resources, are often time-consuming and are not always environmentally friendly. Therefore, researchers are developing computational methods which can minimize experimental procedures and save money. In this research, three machine learning methods, Random Forest (RF), Extreme Gradient Boosting (XGB) and Support Vector Machine (SVM), as well as their ensembles, were evaluated to predict acid-insoluble detergent lignin (AIDL) content in lignocellulose biomass. Three different types of harvest residue (maize stover, soybean straw and sunflower stalk) were first pretreated in a laboratory oven with hot air at two different temperatures (121 and 175 °C) for two different durations (30 and 90 min) with the aim of disintegrating the lignocellulosic structure, i.e., delignification. Based on leave-one-out cross-validation, XGB resulted in the highest accuracy for all individual harvest residues, achieving a coefficient of determination (R²) in the range of 0.756–0.980. The relative variable importances for all individual harvest residues strongly suggested the dominant impact of pretreatment temperature in comparison to its duration. These findings proved the effectiveness of machine learning prediction in the optimization of lignocellulose pretreatment, leading to a more efficient lignin destabilization approach.
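The leave-one-out cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: an ordinary least-squares model stands in for RF, XGB and SVM so that the example needs only NumPy, and the response values are synthetic. Only the temperature and duration levels are taken from the abstract; the helper names (`loocv_r2`, `fit_ols`, `predict_ols`) are hypothetical.

```python
import numpy as np

def loocv_r2(X, y, fit, predict):
    """Hold out each sample once, pool the held-out predictions, return R^2."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i            # boolean mask: every sample except i
        model = fit(X[train], y[train])
        preds[i] = predict(model, X[i:i + 1])[0]
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_ols(X, y):
    """Stand-in model (assumption): linear regression with an intercept."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Synthetic illustration: 2 temperatures x 2 durations x 3 replicates.
rng = np.random.default_rng(0)
temperature = np.repeat([121.0, 175.0], 6)              # degrees C, from the abstract
duration = np.tile(np.repeat([30.0, 90.0], 3), 2)       # minutes, from the abstract
X = np.column_stack([temperature, duration])
# Made-up response in which temperature matters more than duration.
y = 30.0 - 0.05 * temperature - 0.01 * duration + rng.normal(0.0, 0.3, size=12)

r2 = loocv_r2(X, y, fit_ols, predict_ols)
print(f"LOOCV R^2 on synthetic data: {r2:.3f}")
```

Because `fit` and `predict` are passed in as functions, any regressor with a comparable interface could be substituted for the stand-in model. Pooling the held-out predictions before computing R² is one common convention for small samples; the source does not state which convention the study used.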


“…Two machine learning methods, Random Forest (RF) and Extreme Gradient Boosting (XGB), were evaluated alongside DNN. RF and XGB achieved superior prediction accuracy in regression problems compared to current machine learning algorithms in similar studies on various aspects of horticulture [58,59] and agriculture in general [60][61][62]. As an ensemble learning technique, RF builds a forest of decision trees, each trained separately on randomly selected samples of the data and features [63].…”

Section: Deep and Machine Learning Prediction And Accuracy Assessment (mentioning)

confidence: 99%

Indoor Plant Soil-Plant Analysis Development (SPAD) Prediction Based on Multispectral Indices and Soil Electroconductivity: A Deep Learning Approach

Radočaj, Rapčan, Jurišić 2023. Horticulturae (self-citation)

Leaf Soil-Plant Analysis Development (SPAD) prediction is a crucial measure of plant health and is essential for optimizing indoor plant management. Deep learning methods offer advanced tools for precise evaluations, but their adaptation to the heterogeneous indoor plant ecosystem presents distinct challenges. This study assesses how accurately a deep neural network (DNN) predicts SPAD values in the leaves of indoor plants when compared to well-established machine learning techniques, including Random Forest (RF) and Extreme Gradient Boosting (XGB). The covariates for prediction were based on low-cost multispectral and soil electro-conductivity (EC) sensors, enabling a non-destructive sensing approach. The study also strongly emphasized multicollinearity analysis quantified by the Variance Inflation Factor (VIF) and two independent indices, as well as its effect on prediction accuracy using deep and machine learning methods. DNN resulted in higher accuracy than RF and XGB, also performing better using filtered data after multicollinearity analysis based on the coefficient of determination (R²), root mean square error (RMSE) and mean absolute error (MAE) (R² = 0.589, RMSE = 11.68, MAE = 9.52) in comparison to using all input covariates (R² = 0.476, RMSE = 12.90, MAE = 10.94). Overall, DNN was proven to be a more accurate prediction method than the conventional machine learning approach for the prediction of leaf SPAD values in indoor plants, despite using heterogeneous plant types and input covariates.
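The VIF-based multicollinearity filtering described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the covariates are synthetic, the threshold of 10 is a common rule of thumb that the abstract does not specify, and the helper names (`vif`, `filter_by_vif`) are hypothetical. The two additional indices mentioned in the abstract are not covered.

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor per column: 1 / (1 - R^2_j), where R^2_j comes
    from regressing column j on all other columns plus an intercept."""
    n, p = X.shape
    scores = np.empty(p)
    for j in range(p):
        target = X[:, j]
        design = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        ss_res = np.sum((target - design @ coef) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        scores[j] = ss_tot / ss_res          # algebraically equal to 1 / (1 - R^2_j)
    return scores

def filter_by_vif(X, threshold=10.0):
    """Drop the covariate with the highest VIF until all are at or below the
    threshold (threshold value is an assumption). Returns kept column indices."""
    kept = list(range(X.shape[1]))
    while len(kept) > 1:
        scores = vif(X[:, kept])
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        kept.pop(worst)
    return kept

# Synthetic covariates: x2 is nearly a copy of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

scores = vif(X)
kept = filter_by_vif(X)
print("VIF per covariate:", np.round(scores, 2))
print("Columns kept after filtering:", kept)
```

In this synthetic case the two near-duplicate covariates receive large VIF values and one of them is removed, while the independent covariate is retained. A prediction model would then be trained on `X[:, kept]` and compared against the model trained on all covariates, which is the comparison the abstract reports.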


A Comprehensive Evaluation of Machine Learning Algorithms for Digital Soil Organic Carbon Mapping on a National Scale

Radočaj, Jug, Jug et al. 2024. Applied Sciences

The aim of this study was to narrow the research gap of ambiguity in which machine learning algorithms should be selected for evaluation in digital soil organic carbon (SOC) mapping. This was performed by providing a comprehensive assessment of prediction accuracy for 15 frequently used machine learning algorithms in digital SOC mapping based on studies indexed in the Web of Science Core Collection (WoSCC), providing a basis for algorithm selection in future studies. Two study areas, including mainland France and the Czech Republic, were used in the study based on 2514 and 400 soil samples from the LUCAS 2018 dataset. Random Forest was first ranked for France (mainland) and then ranked for the Czech Republic regarding prediction accuracy; the coefficients of determination were 0.411 and 0.249, respectively, which was in accordance with its dominant appearance in previous studies indexed in the WoSCC. Additionally, the K-Nearest Neighbors and Gradient Boosting Machine regression algorithms indicated, relative to their frequency in studies indexed in the WoSCC, that they are underrated and should be more frequently considered in future digital SOC studies. Future studies should consider study areas not strictly related to human-made administrative borders, as well as more interpretable machine learning and ensemble machine learning approaches.
