Traditional methods for analyzing the biogenic and fossil carbon
shares in solid waste are time-consuming and labor-intensive. A novel
approach was developed to directly classify the carbon group and predict
carbon content using the hyperspectral imaging (HSI) spectra of solid
waste in conjunction with state-of-the-art tree-based machine learning
models, including random forest (RF), extreme gradient boosting (XGBoost), and
light gradient boosting machine (LGBM). All of the classifiers achieved
an accuracy above 0.95, and all of the regressors achieved an R² of 0.96,
on the test set. In addition,
two model interpretation approaches, the Shapley additive explanation
and model explainer, were applied. The results showed that the predictions
of the developed models were based on a reasonable understanding of
the overtones and vibrations of the functional groups (C–H, N–H,
and O–H). Furthermore, the developed models were validated
by an external test set, which did not overlap with the data used
for model construction. The RF and LGBM showed robust performance
with a 0.790 accuracy for carbon group classification and a 0.806 R²
for carbon content prediction. Overall, the
optimal models provided a rapid method for characterizing the biogenic
carbon share in solid waste based on raw HSI spectra without preprocessing.
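
The two figures of merit reported above, classification accuracy and the coefficient of determination (R²), can be illustrated with a minimal sketch. It assumes the standard definitions of both metrics; the function names and the four-sample labels and carbon contents below are hypothetical illustration values, not data or code from this study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted carbon group matches the label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out samples (illustration only).
groups_true = ["biogenic", "fossil", "biogenic", "fossil"]
groups_pred = ["biogenic", "fossil", "fossil", "fossil"]
carbon_true = [40.0, 50.0, 60.0, 70.0]  # carbon content, wt %
carbon_pred = [42.0, 48.0, 61.0, 69.0]

print(f"accuracy = {accuracy(groups_true, groups_pred):.3f}")
print(f"R^2      = {r_squared(carbon_true, carbon_pred):.3f}")
```

Computing both metrics on samples that were never used for model construction, as with the external test set described above, is what makes the reported values a measure of generalization rather than of fit.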