Milan Kilibarda scite author profile

African soil properties and nutrients mapped at 30 m spatial resolution using two-scale ensemble machine learning

Soil property and class maps for the continent of Africa were so far only available at very generalised scales, with many countries not mapped at all. Thanks to an increasing quantity and availability of soil samples collected at field point locations by various government and/or NGO funded projects, it is now possible to produce detailed pan-African maps of soil nutrients, including micro-nutrients at fine spatial resolutions. In this paper we describe production of a 30 m resolution Soil Information System of the African continent using, to date, the most comprehensive compilation of soil samples (N ≈ 150,000) and Earth Observation data. We produced predictions for soil pH, organic carbon (C) and total nitrogen (N), total carbon, effective Cation Exchange Capacity (eCEC), extractable phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), sulfur (S), sodium (Na), iron (Fe) and zinc (Zn), as well as silt, clay and sand, stone content, bulk density and depth to bedrock, at three depths (0, 20 and 50 cm) and using a 2-scale 3D Ensemble Machine Learning framework implemented in the mlr (Machine Learning in R) package. As covariate layers we used 250 m resolution (MODIS, PROBA-V and SM2RAIN products), and 30 m resolution (Sentinel-2, Landsat and DTM derivatives) images. Our fivefold spatial Cross-Validation results showed varying accuracy levels ranging from the best performing soil pH (CCC = 0.900) to more poorly predictable extractable phosphorus (CCC = 0.654) and sulphur (CCC = 0.708) and depth to bedrock. Sentinel-2 bands SWIR (B11, B12), NIR (B09, B8A), Landsat SWIR bands, and vertical depth derived from 30 m resolution DTM, were the overall most important 30 m resolution covariates. Climatic data images (SM2RAIN, bioclimatic variables and MODIS Land Surface Temperature), however, remained the overall most important variables for predicting soil chemical variables at continental scale.
This publicly available 30-m Soil Information System of Africa aims at supporting numerous applications, including soil and fertilizer policies and investments, agronomic advice to close yield gaps, environmental programs, or targeting of nutrition interventions.
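
The accuracy figures above are reported as CCC, which is commonly read as Lin's concordance correlation coefficient. As a minimal sketch, assuming that definition and population (divide-by-n) moments, the coefficient can be computed from paired observed and predicted values as follows. The function name and the toy numbers are illustrative and do not come from the paper.

```python
from statistics import fmean


def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient for paired values.

    Uses population (divide-by-n) variances and covariance. This is an
    illustrative helper, not code from the paper.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_o = fmean(observed)
    mean_p = fmean(predicted)
    var_o = fmean((o - mean_o) ** 2 for o in observed)
    var_p = fmean((p - mean_p) ** 2 for p in predicted)
    cov = fmean((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    denominator = var_o + var_p + (mean_o - mean_p) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both sequences are the same constant")
    return 2 * cov / denominator


# Toy example: predictions that track the observations but carry a constant bias.
perfect = concordance_correlation([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
biased = concordance_correlation([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Unlike the Pearson correlation, the squared difference of the means in the denominator penalises systematic bias, so the biased toy predictions score below 1 even though they are perfectly correlated with the observations.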

African Soil Properties and Nutrients Mapped at 30-m Spatial Resolution using Two-scale Ensemble Machine Learning

Hengl, Miller, Krizan et al. 2020

Preprint

Soil property and class maps for the continent of Africa were so far only available at very generalised scales, with many countries not mapped at all. Thanks to an increasing quantity and availability of soil samples collected at field point locations by various government and/or NGO funded projects, it is now possible to produce detailed pan-African maps of soil nutrients, including micro-nutrients at fine spatial resolutions. In this paper we describe production of a 30 m resolution Soil Information System of the African continent using, to date, the most comprehensive compilation of soil samples (N ≈ 150,000) and Earth Observation data. We produced predictions for soil pH, organic carbon (C) and total nitrogen (N), total carbon, Cation Exchange Capacity (eCEC), extractable phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), sulfur (S), sodium (Na), iron (Fe) and zinc (Zn), as well as silt, clay and sand, stone content, bulk density and depth to bedrock, at three depths (0, 20 and 50 cm) and using a 2-scale 3D Ensemble Machine Learning framework implemented in the mlr (Machine Learning in R) package. As covariate layers we used 250 m resolution (MODIS, PROBA-V and SM2RAIN products), and 30 m resolution (Sentinel-2, Landsat and DTM derivatives) images. Our 5-fold spatial Cross-Validation results showed varying accuracy levels ranging from the best performing soil pH (CCC = 0.900) to more poorly predictable extractable phosphorus (CCC = 0.654) and sulphur (CCC = 0.708) and depth to bedrock. Sentinel-2 bands SWIR (B11, B12), NIR (B09, B8A), Landsat SWIR bands, and vertical depth derived from 30 m resolution DTM, were the overall most important 30 m resolution covariates. Climatic data images (SM2RAIN, bioclimatic variables and MODIS Land Surface Temperature), however, remained the overall most important variables for predicting soil chemical variables at continental scale.
The publicly available 30-m soil maps are suitable for numerous applications, including soil and fertilizer policies and investments, agronomic advice to close yield gaps, environmental programs, or targeting of nutrition interventions.

A spatiotemporal ensemble machine learning framework for generating land use/land cover time-series maps for Europe (2000–2019) based on LUCAS, CORINE and GLAD Landsat

Witjes, Parente, Diemen et al. 2022

A spatiotemporal machine learning framework for automated prediction and analysis of long-term Land Use/Land Cover dynamics is presented. The framework includes: (1) harmonization and preprocessing of spatial and spatiotemporal input datasets (GLAD Landsat, NPP/VIIRS) including five million harmonized LUCAS and CORINE Land Cover-derived training samples, (2) model building based on spatial k-fold cross-validation and hyper-parameter optimization, (3) prediction of the most probable class, class probabilities and model variance of predicted probabilities per pixel, (4) LULC change analysis on time-series of produced maps. The spatiotemporal ensemble model consists of a random forest, gradient boosted tree classifier, and an artificial neural network, with a logistic regressor as meta-learner. The results show that the most important variables for mapping LULC in Europe are: seasonal aggregates of Landsat green and near-infrared bands, multiple Landsat-derived spectral indices, long-term surface water probability, and elevation. Spatial cross-validation of the model indicates consistent performance across multiple years with overall accuracy (a weighted F1-score) of 0.49, 0.63, and 0.83 when predicting 43 (level-3), 14 (level-2), and five classes (level-1). Additional experiments show that spatiotemporal models generalize better to unknown years, outperforming single-year models on known-year classification by 2.7% and unknown-year classification by 3.5%. Results of the accuracy assessment using 48,365 independent test samples show an 87% match with the validation points. Results of time-series analysis (time-series of LULC probabilities and NDVI images) suggest forest loss in large parts of Sweden, the Alps, and Scotland. Positive and negative trends in NDVI in general match the land degradation and land restoration classes, with “urbanization” showing the most negative NDVI trend.
An advantage of using spatiotemporal ML is that the fitted model can be used to predict LULC in years that were not included in its training dataset, allowing generalization to past and future periods, e.g. to predict LULC for years prior to 2000 and beyond 2020. The generated LULC time-series data stack (ODSE-LULC), including the training points, is publicly available via the ODSE Viewer. Functions used to prepare data and run modeling are available via the eumap library for Python.
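
The abstract reports overall accuracy as a weighted F1-score. As a rough illustration, assuming the usual support-weighted definition (per-class F1 averaged with weights proportional to the number of true samples in each class), the score can be computed as below. The function name and the class labels are hypothetical; whether the authors weight in exactly this way is an assumption.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Illustrative helper: only classes that occur in y_true contribute,
    each weighted by its number of true samples.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty, equally long label sequences")
    support = Counter(y_true)
    weighted_sum = 0.0
    for label, n_true in support.items():
        # True positives and false positives for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n_true - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is at least n_true >= 1.
        f1 = 2 * tp / (2 * tp + fp + fn)
        weighted_sum += n_true * f1
    return weighted_sum / len(y_true)


# Toy example with three hypothetical land cover classes.
truth = ["forest", "forest", "urban", "water"]
predicted = ["forest", "urban", "urban", "water"]
score = weighted_f1(truth, predicted)
```

Because frequent classes dominate the average, a weighted F1 over 43 classes can be much lower than over five aggregated classes, which is consistent with the drop from 0.83 to 0.49 reported above.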

A spatiotemporal ensemble machine learning framework for generating land use / land cover time-series maps for Europe (2000 – 2019) based on LUCAS, CORINE and GLAD Landsat

Witjes, Parente, Diemen et al. 2022

Preprint

A seamless spatiotemporal machine learning framework for automated prediction and analysis of long-term Land Use / Land Cover dynamics is presented. The framework includes: (1) harmonization and preprocessing of high-resolution spatial and spatiotemporal input datasets (GLAD Landsat, NPP/VIIRS) including 5 million harmonized LUCAS and CORINE Land Cover-derived training samples, (2) model building based on spatial k-fold cross-validation and hyper-parameter optimization, (3) prediction of the most probable class, class probabilities and model variance of predicted probabilities per pixel, (4) LULC change analysis on time-series of produced maps. The spatiotemporal ensemble model consists of a random forest, gradient boosted tree classifier, and an artificial neural network, with a logistic regressor as meta-learner. The results show that the most important variables for mapping LULC in Europe are: seasonal aggregates of Landsat green and near-infrared bands, multiple Landsat-derived spectral indices, long-term surface water probability, and elevation. Spatial cross-validation of the model indicates consistent performance across multiple years with overall accuracy (a weighted F1-score) of 0.49, 0.63, and 0.83 when predicting 43 (level-3), 14 (level-2), and 5 classes (level-1). The spatiotemporal model outperforms spatial models on known-year classification by 2.7% and unknown-year classification by 3.5%. Results of the accuracy assessment using 48,365 independent test samples show an 87% match with the validation points. Results of time-series analysis (time-series of LULC probabilities and NDVI images) suggest forest loss in large parts of Sweden, the Alps, and Scotland. Positive and negative trends in NDVI in general match the land degradation and land restoration classes, with “urbanization” showing the most negative NDVI trend.
An advantage of using spatiotemporal ML is that the fitted model can be used to predict LULC in years that were not included in its training dataset, allowing generalization to past and future periods, e.g. to predict LULC for years prior to 2000 and beyond 2020. The generated LULC time-series data stack (ODSE-LULC), including the training points, is publicly available via the ODSE Viewer. Functions used to prepare data and run modeling are available via the eumap library for Python.

A high-resolution daily gridded meteorological dataset for Serbia made by Random Forest Spatial Interpolation

et al. 2021

We produced the first daily gridded meteorological dataset at a 1-km spatial resolution across Serbia for 2000–2019, named MeteoSerbia1km. The dataset consists of five daily variables: maximum, minimum and mean temperature, mean sea-level pressure, and total precipitation. In addition to daily summaries, we produced monthly and annual summaries, and daily, monthly, and annual long-term means. Daily gridded data were interpolated using the Random Forest Spatial Interpolation methodology, based on using the nearest observations and distances to them as spatial covariates, together with environmental covariates to make a random forest model. The accuracy of the MeteoSerbia1km daily dataset was assessed using nested 5-fold leave-location-out cross-validation. All temperature variables and sea-level pressure showed high accuracy, although accuracy was lower for total precipitation, due to the discontinuity in its spatial distribution. MeteoSerbia1km was also compared with the E-OBS dataset with a coarser resolution: both datasets showed similar coarse-scale patterns for all daily meteorological variables, except for total precipitation. As a result of its high resolution, MeteoSerbia1km is suitable for further environmental analyses.
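
The spatial covariates described above (values at the nearest observations and the distances to them) can be sketched as a small feature-building step. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name, the planar Euclidean distance, the station tuples and the choice of two neighbours are all hypothetical, and the environmental covariates and the random forest fit itself are left out.

```python
from math import hypot


def nearest_observation_features(target, stations, n_nearest=2):
    """Build [value_1, dist_1, value_2, dist_2, ...] for one target location.

    target:   (x, y) in a projected coordinate system (assumed planar).
    stations: list of (x, y, observed_value) tuples for a single day.
    The neighbours are ordered from nearest to farthest.
    """
    if len(stations) < n_nearest:
        raise ValueError("not enough stations for the requested neighbours")
    # Pair every station's distance to the target with its observed value.
    ranked = sorted(
        (hypot(x - target[0], y - target[1]), value) for x, y, value in stations
    )
    features = []
    for distance, value in ranked[:n_nearest]:
        features.extend([value, distance])
    return features


# Toy example: three hypothetical stations with daily mean temperature.
stations = [(0.0, 0.0, 10.0), (3.0, 4.0, 20.0), (10.0, 0.0, 30.0)]
row = nearest_observation_features((0.0, 0.0), stations, n_nearest=2)
```

When such rows are built for the training stations themselves, the station being predicted would need to be excluded from its own neighbour list, otherwise the nearest value at distance zero leaks the target into the features.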
