The real-time forecasts of ozone (O3) from seven air quality forecast models (AQFMs) are statistically evaluated against observations collected during July and August of 2004 (53 days) through the Aerometric Information Retrieval Now (AIRNow) network at roughly 340 monitoring stations throughout the eastern United States and southern Canada. One of the first ever real-time ensemble O3 forecasts, created by combining the seven separate forecasts with equal weighting, is also evaluated in terms of standard statistical measures, threshold statistics, and variance analysis. The ensemble based on the mean of the seven models and the ensemble based on the median are found to have significantly more temporal correlation to the observed daily maximum 1-hour average and maximum 8-hour average O3 concentrations than any individual model. However, root-mean-square errors (RMSE) and skill scores show that the usefulness of the uncorrected ensembles is limited by positive O3 biases in all of the AQFMs. The ensembles and AQFM statistical measures are reevaluated using two simple bias correction algorithms for forecasts at each monitor location: subtraction of the mean bias and a multiplicative ratio adjustment, where corrections are based on the full 53 days of available comparisons. The impact the two bias correction techniques have on RMSE, threshold statistics, and temporal variance is presented. For the threshold statistics a preferred bias correction technique is found to be model dependent and related to whether the model overpredicts or underpredicts observed temporal O3 variance. All statistical measures of the ensemble mean forecast, and particularly the bias-corrected ensemble forecast, are found to be insensitive to the results of any particular model. The higher correlation coefficients, low RMSE, and better threshold statistics for the ensembles compared to any individual model point to their preference as a real-time O3 forecast.
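The equal-weight ensembles and the two bias corrections described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so this is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, and the multiplicative adjustment is assumed here to be the ratio of the mean observation to the mean forecast at one monitor.

```python
import numpy as np


def ensemble_forecasts(forecasts):
    """Equal-weight ensembles for one monitor.

    forecasts: array of shape (n_models, n_days).
    Returns (ensemble_mean, ensemble_median), each of shape (n_days,).
    """
    forecasts = np.asarray(forecasts, dtype=float)
    return forecasts.mean(axis=0), np.median(forecasts, axis=0)


def subtract_mean_bias(forecast, observed):
    """Additive correction: remove the period-mean bias (forecast - observed)."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return forecast - np.mean(forecast - observed)


def ratio_adjust(forecast, observed):
    """Multiplicative correction (assumed form): scale by mean(obs) / mean(forecast)."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return forecast * (observed.mean() / forecast.mean())


def rmse(forecast, observed):
    """Root-mean-square error between a forecast series and observations."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Both corrections force the period-mean forecast to match the period-mean observation. The additive form leaves the forecast variance unchanged, while the multiplicative form rescales it, which is one plausible reading of why the preferred technique depends on whether a model over- or underpredicts the observed O3 variance.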
Forecasts of ozone (O3) and particulate matter (diameter less than 2.5 μm, PM2.5) from seven air quality forecast models (AQFMs) are statistically evaluated against observations collected during August and September of 2006 (49 days) through the Aerometric Information Retrieval Now (AIRNow) network throughout eastern Texas and adjoining states. Ensemble O3 and PM2.5 forecasts created by combining the seven separate forecasts with equal weighting, and simple bias-corrected forecasts, are also evaluated in terms of standard statistical measures, threshold statistics, and variance analysis. For O3 the models and ensemble generally show statistical skill relative to persistence for the entire region, but fail to predict high-O3 events in the Houston region. For PM2.5, none of the models, or ensemble, shows statistical skill, and all but one model have significant low bias. Comprehensive comparisons with the full suite of chemical and aerosol measurements collected aboard the NOAA WP-3 aircraft during the summer 2006 Second Texas Air Quality Study and the Gulf of Mexico Atmospheric Composition and Climate Study (TexAQS II/GoMACCS) field study are performed to help diagnose sources of model bias at the surface. Aircraft flights specifically designed for sampling of Houston and Dallas urban plumes are used to determine model and observed upwind or background biases, and downwind excess concentrations that are used to infer relative emission rates. Relative emissions from the U.S. Environmental Protection Agency 1999 National Emission Inventory (NEI-99) version 3 emissions inventory (used in two of the model forecasts) are evaluated on the basis of comparisons between observed and model concentration difference ratios. Model comparisons demonstrate that concentration difference ratios yield a reasonably accurate measure (within 25%) of relative input emissions.
Boundary layer height and wind data are combined with the observed upwind and downwind concentration differences to estimate absolute emissions. When the NEI-99 inventory is modified to include observed NOy emissions from continuous monitors and expected NOx decreases from mobile sources between 1999 and 2006, good agreement is found with the emissions derived from the observations for both Houston and Dallas. However, the emission inventories consistently overpredict the ratio of CO to NOy. The ratios of ethylene and aromatics to NOy are reasonably consistent with observations over Dallas, but are significantly underpredicted for Houston.
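The idea of a concentration difference ratio can be made concrete with a small Python sketch. The abstract does not state the formula, so the definition below is an assumption: the downwind-minus-upwind enhancement of a species divided by the same enhancement of NOy. The function names and the numbers in the usage note are hypothetical, not measurements from the study.

```python
def difference_ratio(species_down, species_up, noy_down, noy_up):
    """Assumed form: (downwind - upwind) of a species over (downwind - upwind) of NOy.

    All four inputs are mixing ratios in the same units (e.g. ppbv).
    """
    delta_noy = noy_down - noy_up
    if delta_noy == 0:
        raise ValueError("no NOy enhancement between upwind and downwind legs")
    return (species_down - species_up) / delta_noy


def relative_difference(model_ratio, observed_ratio):
    """Fractional departure of the model ratio from the observed ratio."""
    return (model_ratio - observed_ratio) / observed_ratio
```

For example, with made-up values, `difference_ratio(250.0, 130.0, 14.0, 4.0)` gives a CO-to-NOy enhancement ratio of 12.0, and a model ratio of 15.0 against that value gives `relative_difference(15.0, 12.0) == 0.25`, meaning the model's ratio is 25% high.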
The National Oceanic and Atmospheric Administration recently sponsored the New England Forecasting Pilot Program to serve as a "test bed" for chemical forecasting by providing all of the elements of a National Air Quality Forecasting System, including the development and implementation of an evaluation protocol. This Pilot Program enlisted three regional-scale air quality models, serving as prototypes, to forecast ozone (O3) concentrations across the northeastern United States during the summer of 2002. A suite of statistical metrics was identified as part of the protocol that facilitated evaluation of both discrete forecasts (observed versus modeled concentrations) and categorical forecasts (observed versus modeled exceedances/nonexceedances) for both the maximum 1-hr (125 ppb) and 8-hr (85 ppb) forecasts produced by each of the models. Implementation of the evaluation protocol took place during a 25-day period (August 5-29), utilizing hourly O3 concentration data obtained from over 450 monitors from the U.S. Environmental Protection Agency's Air Quality System network.
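A categorical evaluation of the kind described here reduces each observed/forecast pair to an exceedance or nonexceedance and tabulates the four possible outcomes. The Python sketch below is illustrative only: the abstract does not list the protocol's specific metrics, so the four scores shown are commonly used categorical measures chosen as an assumption, and treating a value at or above the threshold as an exceedance is likewise an assumption.

```python
import numpy as np


def categorical_scores(observed, forecast, threshold):
    """Contingency-table scores for exceedance forecasts at one threshold (ppb)."""
    obs_exc = np.asarray(observed, dtype=float) >= threshold  # assumed: >= counts
    fc_exc = np.asarray(forecast, dtype=float) >= threshold

    hits = int(np.sum(obs_exc & fc_exc))            # exceedance forecast and observed
    misses = int(np.sum(obs_exc & ~fc_exc))         # observed but not forecast
    false_alarms = int(np.sum(~obs_exc & fc_exc))   # forecast but not observed
    correct_neg = int(np.sum(~obs_exc & ~fc_exc))   # neither forecast nor observed
    total = hits + misses + false_alarms + correct_neg

    def ratio(num, den):
        # Undefined scores (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(hits + correct_neg, total),
        "hit_rate": ratio(hits, hits + misses),
        "false_alarm_ratio": ratio(false_alarms, hits + false_alarms),
        "critical_success_index": ratio(hits, hits + misses + false_alarms),
    }
```

Calling it once with `threshold=125` on daily maximum 1-hr values and once with `threshold=85` on daily maximum 8-hr values would mirror the two thresholds named above. Because exceedances are rare, accuracy alone tends to look favorable, which is why scores that ignore correct nonexceedances are usually reported alongside it.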