Apparent quantum yields (Φ) of photochemically produced reactive intermediates (PPRIs) formed by dissolved organic matter (DOM) are key to element cycling and contaminant fate in surface waters. Determining Φ_PPRI values simultaneously for numerous water samples with existing experimental methods is time consuming and inefficient. Herein, machine learning models were developed with a systematic data set of 1329 data points to predict three Φ_PPRI values (Φ_3DOM*, Φ_1O2, and Φ_•OH) from DOM spectral parameters, experimental conditions, and calculation parameters. The best predictive performance for Φ_3DOM*, Φ_1O2, and Φ_•OH was achieved with the CatBoost model, which outperformed traditional linear regression models. The importance of the wavelength range and spectral parameters to the three Φ_PPRI predictions was revealed, suggesting that DOM with lower molecular weight, lower aromatic content, and a larger autochthonous portion possessed higher Φ_PPRI values. Chain models were constructed by adding the predicted Φ_3DOM* as a new feature to the Φ_1O2 and Φ_•OH models, which improved the prediction of Φ_1O2 but worsened that of Φ_•OH, likely owing to the complex formation pathways of •OH. Overall, this study offered robust Φ_PPRI prediction across interlaboratory differences and provided new insights into the relationship between PPRI formation and DOM properties.
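The chain-model idea described above, in which the predicted yield for 3DOM* is appended as an extra feature to the 1O2 model, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the four features are hypothetical placeholders for spectral and experimental descriptors, and scikit-learn's GradientBoostingRegressor stands in for CatBoost. Using out-of-fold predictions for the training rows is one common way to limit leakage between the two stages and is not necessarily the procedure followed in the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's data set): four hypothetical
# features playing the role of spectral and experimental descriptors.
n = 300
X = rng.uniform(size=(n, 4))
phi_3dom = 0.02 + 0.03 * X[:, 0] - 0.01 * X[:, 1] + rng.normal(0, 0.002, n)
phi_1o2 = 0.5 * phi_3dom + 0.005 * X[:, 2] + rng.normal(0, 0.001, n)

X_tr, X_te, t_tr, t_te, s_tr, s_te = train_test_split(
    X, phi_3dom, phi_1o2, test_size=0.25, random_state=0
)

# Stage 1: model for the 3DOM* yield.
stage1 = GradientBoostingRegressor(random_state=0)
# Out-of-fold predictions for the training rows (assumed design choice),
# so stage 2 never sees a stage-1 prediction made on a row it was fitted to.
t_oof = cross_val_predict(stage1, X_tr, t_tr, cv=5)
stage1.fit(X_tr, t_tr)
t_pred_te = stage1.predict(X_te)

# Stage 2: model for the 1O2 yield with the predicted 3DOM* yield
# appended as one additional feature column.
X_tr_chain = np.column_stack([X_tr, t_oof])
X_te_chain = np.column_stack([X_te, t_pred_te])
stage2 = GradientBoostingRegressor(random_state=0).fit(X_tr_chain, s_tr)
phi_1o2_pred = stage2.predict(X_te_chain)
```

Whether chaining helps can then be checked by comparing the test error of `stage2` against a model fitted on `X_tr` alone; as the abstract notes for •OH, an added predicted feature does not guarantee an improvement.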