This study leverages explainable machine learning, specifically
XGBoost models with Shapley Additive Explanations (SHAP), to explore
the chemical properties of atmospheric aerosols in Seoul, Korea, during
the summer of 2019. Focusing on non-refractory submicron particulate
matter (NR-PM1) properties measured by high-resolution time-of-flight
aerosol mass spectrometry (HR-ToF-AMS), the research extends to organic
aerosol (OA) sources identified via positive matrix factorization
of high-resolution MS data. The models achieved good predictive accuracy
(R² > 0.90) for all species concentrations,
except for hydrocarbon-like OA (HOA) due to frequent concentration
fluctuations. The model outcomes aligned well with those previously
obtained using conventional methods (chemical transport modeling and
correlation analysis), confirming that relative humidity is associated
with nocturnal nitrate concentration and that photochemistry is
associated with sulfate concentration in the summertime in Seoul.
Importantly,
the models revealed mostly nonlinear relationships between atmospheric
factors, such as temperature, and particulate matter (PM) components,
thereby deepening the understanding of their formation processes. Notably,
different potential formation mechanisms were discerned for more oxidized
oxygenated OA (MO-OOA) and oxidized primary OA (OPOA). For MO-OOA,
SHAP analysis showed a plateau in SHAP values at an Ox
concentration of 0.085 ppm, which suggested potential
fragmentation from further oxidation and agreed with previous chamber
experiments. Conversely, the lack of a plateau in the Ox
values for OPOA implied potential ongoing oxidation,
suggesting a higher and longer atmospheric oxidation potential. This
approach can rapidly provide potential insights into complex atmospheric
aerosol formation processes. It is essential to acknowledge that SHAP
values do not establish causality, and that knowledge of the underlying
physical and chemical processes is required to draw valid and
comprehensive interpretations from the ML results.
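
To make the plateau reading concrete, the following is a minimal sketch of how Shapley attributions behave when a response saturates. It is not the study's pipeline: the study used XGBoost models with SHAP, whereas this sketch computes exact Shapley values by brute force for a hypothetical two-feature function. The names toy_model, shapley_values, and background, the functional form, and all numbers except the 0.085 ppm threshold are assumptions made for illustration only.

```python
from itertools import combinations
from math import factorial

OX_PLATEAU = 0.085  # ppm; threshold quoted in the abstract, used here only as a toy parameter


def toy_model(features):
    """Hypothetical response: saturates in Ox, linear in RH (illustrative only)."""
    ox, rh = features
    return 4.0 * min(ox, OX_PLATEAU) / OX_PLATEAU + 0.02 * rh


def shapley_values(model, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition are filled in from each background row,
    and the model output is averaged over those rows.
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_i = value(set(s) | {i})
                without_i = value(set(s))
                phi[i] += weight * (with_i - without_i)
    return phi


# Hypothetical background data: [Ox in ppm, RH in %]
background = [[0.03, 40.0], [0.05, 60.0], [0.07, 80.0], [0.10, 50.0]]

# Three samples that differ only in Ox: one below and two above the threshold
samples = [[0.06, 55.0], [0.09, 55.0], [0.12, 55.0]]
phis = [shapley_values(toy_model, s, background) for s in samples]

for s, p in zip(samples, phis):
    print(f"Ox={s[0]:.3f} ppm  SHAP(Ox)={p[0]:+.3f}  SHAP(RH)={p[1]:+.3f}")
```

In this toy setting the Ox attribution stops increasing once Ox exceeds the threshold, which is the kind of pattern a SHAP dependence plot would show as a plateau. As the abstract notes, such a pattern describes the fitted model and does not by itself establish a causal mechanism.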