Statistical and machine learning methods for evaluating trends in air quality under changing meteorological conditions

Qiu, Minghao; Zigler, Corwin; Selin, Noelle E.

doi:10.5194/acp-22-10551-2022

Cited by 22 publications

(13 citation statements)

References 66 publications


“…As higher spatial and temporal densities of training data were predictive of increased model performance (Figure d), future inclusion of these data sources should improve the accuracy of estimates. Other features and data sources such as smoke plume height, spatial lags of meteorology, and indicators of atmospheric mixing such as air temperature at different vertical heights have been found in other settings to improve total PM2.5 estimates, predict variation in the relationship between PM2.5 and AOD, or have the potential to improve the model’s ability to identify when smoke mixes to the surface, something the current model occasionally struggles with, as evidenced by the range of predicted values on days with very low observed smoke PM2.5 values. Future advances could also include alternative machine learning models, such as convolutional neural networks, that take advantage of the spatial information instead of features at a single point and have been found to provide good performance on total PM2.5. While our estimates rely on plume boundaries drawn by NOAA analysts over the contiguous US, automation of plume identification, a task for which early computer vision work has shown promise, could allow for generalization of this approach to other geographic regions, an effort of increasing importance as wildfires grow in many parts of the world. Finally, uncertainty quantification from machine learning models is an active area of research, and future improvements to these estimates could include more granular quantification of uncertainty.…”

Section: Discussion (mentioning)

confidence: 99%

Daily Local-Level Estimates of Ambient Wildfire Smoke PM_2.5 for the Contiguous US

Childs

Wen

et al. 2022

Environ. Sci. Technol.

Self Cite


Smoke from wildfires is a growing health risk across the US. Understanding the spatial and temporal patterns of such exposure and its population health impacts requires separating smoke-driven pollutants from non-smoke pollutants and a long time series to quantify patterns and measure health impacts. We develop a parsimonious and accurate machine learning model of daily wildfire-driven PM2.5 concentrations using a combination of ground, satellite, and reanalysis data sources that are easy to update. We apply our model across the contiguous US from 2006 to 2020, generating daily estimates of smoke PM2.5 over a 10 km-by-10 km grid and use these data to characterize levels and trends in smoke PM2.5. Smoke contributions to daily PM2.5 concentrations have increased by up to 5 μg/m³ in the Western US over the last decade, reversing decades of policy-driven improvements in overall air quality, with concentrations growing fastest for higher income populations and predominantly Hispanic populations. The number of people in locations with at least 1 day of smoke PM2.5 above 100 μg/m³ per year has increased 27-fold over the last decade, including nearly 25 million people in 2020 alone. Our data set can bolster efforts to comprehensively understand the drivers and societal impacts of trends and extremes in wildfire smoke.


“…Code and data availability. The GEOS-Chem simulation of different scenarios and the R scripts to implement the statistical methods to correct for meteorological variability are available at the following repository: https://doi.org/10.5281/zenodo.6857259 (Qiu et al., 2022). All the other data needed to evaluate the conclusions in the paper are present in the paper.…”

Section: Recommendations For Attributing Trends To Emission Changes (mentioning)

confidence: 99%

Statistical and machine learning methods for evaluating trends in air quality under changing meteorological conditions

Qiu

Zigler

Selin

2022

Preprint

Self Cite


Abstract. Evaluating the influence of anthropogenic emissions changes on air quality requires accounting for the influence of meteorological variability. Statistical methods such as multiple linear regression (MLR) models with basic meteorological variables are often used to remove meteorological variability and estimate trends in measured pollutant concentrations attributable to emissions changes. However, the ability of these widely used statistical approaches to correct for meteorological variability remains unknown, limiting their usefulness in real-world policy evaluations. Here, we quantify the performance of MLR and other quantitative methods using two scenarios simulated by a chemical transport model, GEOS-Chem, as a synthetic dataset. Focusing on the impacts of anthropogenic emissions changes in the US (2011 to 2017) and China (2013 to 2017) on PM2.5 and O3, we show that widely used regression methods do not perform well in correcting for meteorological variability and identifying long-term trends in ambient pollution related to changes in emissions. The estimation errors, characterized as the differences between meteorology-corrected trends and emission-driven trends under constant meteorology scenarios, can be reduced by 30 %–42 % using a random forest model that incorporates both local and regional scale meteorological features. We further design a correction method based on GEOS-Chem simulations with constant emission input and quantify the degree to which emissions and meteorological influences are inseparable, due to their process-based interactions. We conclude by providing recommendations for evaluating the effectiveness of emissions reduction policies using statistical approaches.
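The MLR-style meteorological correction described in this abstract can be illustrated with a minimal sketch. It is not the authors' R implementation (that is in the Zenodo repository cited above); the function names, the two covariates, and every number below are hypothetical. The toy concentration series is linear in meteorology by construction, so the regression recovers the emission-driven slope almost exactly. The abstract's point is that real atmospheric responses are not this well behaved.

```python
import numpy as np

def raw_trend(conc, t):
    """Slope of a simple linear fit of concentration against time."""
    return np.polyfit(t, conc, 1)[0]

def met_corrected_trend(conc, met, t):
    """Coefficient on time from an MLR of concentration on time and meteorology.

    conc: (n,) concentrations; met: (n, k) meteorological covariates; t: (n,) time.
    """
    X = np.column_stack([np.ones_like(t, dtype=float), t, met])
    coef, *_ = np.linalg.lstsq(X, conc, rcond=None)
    return coef[1]

# Hypothetical synthetic series: 7 years of monthly values.
t = np.arange(84) / 12.0                             # time in years
temp = 15 + 0.3 * t + 2 * np.sin(2 * np.pi * t)      # warming plus a seasonal cycle
wind = 3 + np.cos(2 * np.pi * t)                     # seasonal wind speed
met = np.column_stack([temp, wind])

# Assumed "truth": emissions lower concentrations by 0.5 units per year,
# while temperature and wind add a purely linear meteorological signal.
conc = 12 - 0.5 * t + 0.4 * temp - 0.8 * wind

print("raw trend:          ", raw_trend(conc, t))
print("met-corrected trend:", met_corrected_trend(conc, met, t))
```

In this sketch the raw trend is biased toward zero because temperature itself trends upward, while the coefficient on time in the regression isolates the assumed emission-driven slope.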


“…The use of multivariate linear regressions is deployed extensively in the field of aerosol science because it is a useful and simple tool that provides insight into different factors that may be simultaneously affecting one dependent parameter in the atmosphere. However, due to the complex nature of atmospheric processes and their nonlinear behavior, these regressions often fail to capture the spatial and temporal variations of these processes with sufficient accuracy (ref 22). Applying machine learning models in the field of atmospheric science could remedy such drawbacks.…”

Section: Introduction (mentioning)

confidence: 99%

Predicting Atmospheric Water-Soluble Organic Mass Reversibly Partitioned to Aerosol Liquid Water in the Eastern United States

El-Sayed,

Parida,

Shekhar

et al. 2023

Environ. Sci. Technol.


Water-soluble organic matter (WSOM) formed through aqueous processes contributes substantially to total atmospheric aerosol; however, the impact of water evaporation on particle concentrations is highly uncertain. Herein, we present a novel approach to predict the amount of evaporated organic mass induced by sample drying using multivariate polynomial regression and random forest (RF) machine learning models. The impact of particle drying on fine WSOM was monitored during three consecutive summers in Baltimore, MD (2015, 2016, and 2017). The amount of evaporated organic mass was dependent on relative humidity (RH), WSOM concentrations, isoprene concentrations, and NOx/isoprene ratios. Different models corresponding to each class were fitted (trained and tested) to data from the summers of 2015 and 2016 while model validation was performed using summer 2017 data. Using the coefficient of determination (R²) and the root-mean-square error (RMSE), it was concluded that an RF model with 100 decision trees had the best performance (R² of 0.81) and the lowest normalized mean error (NME < 1%) leading to low model uncertainties. The relative feature importance for the RF model was calculated to be 0.55, 0.2, 0.15, and 0.1 for WSOM concentrations, RH levels, isoprene concentrations, and NOx/isoprene ratios, respectively. The machine learning model was thus used to predict summertime concentrations of evaporated organics in Yorkville, Georgia, and Centerville, Alabama in 2016 and 2013, respectively. Results presented herein have implications for measurements that rely on sample drying using a machine learning approach for the analysis and interpretation of atmospheric data sets to elucidate their complex behavior.
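For readers unfamiliar with the two scores named in this abstract, the following sketch shows how R² and RMSE are conventionally computed from observed and predicted values. It assumes the standard textbook definitions; the abstract does not state the exact formulas the authors used, and the four data points are made up for illustration.

```python
import math

def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Hypothetical held-out observations and model predictions.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.0, 2.5, 4.0]

print("R^2 :", r_squared(obs, pred))
print("RMSE:", rmse(obs, pred))
```

In a setup like the one described, these scores would be computed on the held-out summer (2017) rather than on the summers used for fitting.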
