This study introduces a novel hybrid model that combines Bayesian Stochastic Partial Differential Equations (SPDE) with deep learning, specifically Convolutional Neural Networks (CNN) and Deep Feedforward Neural Networks (DFFNN), to predict PM2.5 concentrations. Traditional models often fail to account for the non-linear relationships and complex spatial dependencies that are critical in urban settings. Accurate air pollution prediction supports sustainable public health strategies and targeted interventions, which are essential for mitigating the adverse health effects of PM2.5, particularly in urban areas heavily impacted by climate change. The hybrid model was applied to the Pleasant Run Airshed in Indianapolis, Indiana, using a dataset that included PM2.5 sensor data, meteorological variables, and land-use information. By integrating SPDE’s spatial-temporal structure with neural networks’ capacity to capture non-linearity, the model achieved high predictive accuracy and significantly outperformed standalone methods. The model’s interpretability was enhanced through SHAP (SHapley Additive exPlanations) values, which quantify the contribution of each variable to the model’s predictions. This framework has the potential to improve air quality monitoring and to support more targeted public health interventions and policy-making.