Traditional methods for analyzing the biogenic and fossil carbon shares
in solid waste are time-consuming and labor-intensive. A novel
approach was developed to directly classify the carbon group and predict
carbon content using the hyperspectral imaging (HSI) spectra of solid
waste in conjunction with state-of-the-art tree-based machine learning
models, including random forest (RF), extreme gradient boosting, and
light gradient boosting machine (LGBM). On the test set, all of the
classifiers achieved an accuracy above 0.95 and all of the regressors
achieved an R² of 0.96. In addition,
two model interpretation approaches, the Shapley additive explanation
and model explainer, were applied. The results showed that the predictions
of the developed models were based on a reasonable understanding of
the overtone and shake of the functional groups (C–H, N–H,
and O–H). Furthermore, the developed models were validated
by an external test set, which did not overlap with the data used
for model construction. The RF and LGBM showed robust performance
with an accuracy of 0.790 for carbon group classification and an R² of
0.806 for carbon content prediction. Overall, the
optimal models provided a rapid method for characterizing the biogenic
carbon share in solid waste based on raw HSI spectra without preprocessing.
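
The two figures of merit quoted above, classification accuracy and the coefficient of determination (R²), can be illustrated with a minimal sketch. It assumes the standard textbook definitions; the abstract does not state how the authors computed them, and the function names `accuracy` and `r_squared` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted carbon group matches the true one."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1 - ss_res / ss_tot
```

Under these definitions, an R² of 0 corresponds to a model that does no better than always predicting the mean carbon content, which gives a reference point for values such as 0.806 on the external test set.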
Insights into the carbon sources (biogenic and fossil carbon) and carbon contents of solid waste are vital for estimating the carbon emissions from incineration plants. However, the traditional methods are time-, labor-, and cost-intensive. Herein, high-quality data sets were established by analyzing the carbon contents and infrared spectra of a substantial number of samples using elemental analysis and attenuated total reflectance-Fourier transform infrared spectroscopy (ATR-FTIR), respectively. Then, five classification and eight regression machine learning (ML) models were evaluated for recognizing the proportion of biogenic and fossil carbon in solid waste. Using the optimized data preprocessing approach, the random forest (RF) classifier with hyperparameter tuning ranked first in classifying the carbon group, with a test accuracy of 0.969, and the carbon contents were successfully predicted by the RF regressor with R² = 0.926, considering the trade-off among performance, interpretability, and computation time. The proposed algorithms were further validated with real environmental samples and exhibited robust performance, with an accuracy of 0.898 for carbon group classification and an R² value of 0.851 for carbon content prediction. These results indicate that ATR-FTIR coupled with ML algorithms is feasible for rapidly identifying both carbon groups and contents, facilitating the calculation and assessment of carbon emissions from solid waste incineration.
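
To show how a predicted carbon group and carbon content per sample could feed into an emissions estimate, the following is a minimal sketch of a mass-weighted biogenic carbon share. It is an illustration only and is not taken from the paper: it assumes each sample belongs to exactly one of two groups ("biogenic" or "fossil"), that carbon content is expressed as a mass fraction, and that sample masses are known. The `Sample` class and `biogenic_carbon_share` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    mass_kg: float          # mass of the waste fraction (assumed known)
    carbon_fraction: float  # predicted carbon content as a mass fraction, 0 to 1
    carbon_group: str       # predicted group: "biogenic" or "fossil" (assumed labels)


def biogenic_carbon_share(samples):
    """Share of total carbon mass that comes from samples labeled biogenic."""
    total_carbon = sum(s.mass_kg * s.carbon_fraction for s in samples)
    if total_carbon <= 0:
        raise ValueError("total carbon mass must be positive")
    biogenic_carbon = sum(
        s.mass_kg * s.carbon_fraction
        for s in samples
        if s.carbon_group == "biogenic"
    )
    return biogenic_carbon / total_carbon
```

For example, 3 kg of biogenic material and 1 kg of fossil material, both at a carbon fraction of 0.5, would give a biogenic carbon share of 0.75 under these assumptions.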