To address limitations of observational epidemiology studies of air pollution and health effects, including residual confounding by temporal and spatial factors, several studies have taken advantage of ‘natural experiments’, in which an environmental policy or air quality intervention has reduced ambient air pollution concentrations. Researchers have examined whether the population affected by these air quality improvements also experienced improvements in various health indices (e.g. reduced morbidity/mortality). In this paper, I review key earlier accountability studies and newer studies done over the past several years in Beijing, Atlanta, London, Ireland, and other locations, describing the study design and analysis strengths and limitations of each. As new ‘natural experiment’ opportunities arise, several lessons learned from these studies should be applied when planning a new accountability study. Comparing health outcomes during the intervention with those both before and after it in the population of interest, and using a control population to assess whether any temporal changes in the population of interest were also seen in populations not affected by the air quality improvements, should help minimize residual confounding by long-term time trends. Use of detailed health records for a population, or of prospectively collected data on relevant mechanistic biomarkers coupled with such morbidity/mortality data, may provide a more thorough assessment of whether the intervention benefited the health of the community and, if so, by what mechanism(s). Further, prospective measurement of a large suite of air pollutants may allow a more thorough understanding of which pollutant source(s) is/are responsible for any health benefit observed.
The importance of using multiple statistical analysis methods in each paper, and how the timing of the air pollution/outcome association may affect which of these design features is most important, are also discussed. By applying these and other lessons learned, researchers may provide a more epidemiologically rigorous evaluation of cause-specific health impacts of an air quality intervention or action.
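
As a rough illustration of the design recommended above (comparing the intervention period with the periods before and after it, and subtracting the change seen in a control population), the following minimal Python sketch computes a simple difference-in-differences contrast. It is not taken from any of the reviewed studies: the function names, the choice to pool the before and after periods into one baseline, and all of the weekly counts are hypothetical assumptions made only for illustration.

```python
from statistics import mean


def period_change(outcomes):
    """Change in mean outcome during the intervention relative to baseline.

    Assumption for this sketch: the baseline pools the 'before' and 'after'
    periods, so a roughly linear long-term trend tends to cancel out.
    """
    baseline = mean(outcomes["before"] + outcomes["after"])
    return mean(outcomes["during"]) - baseline


def intervention_effect(target, control):
    """Difference-in-differences: change in the target population minus
    the change seen over the same periods in the control population."""
    return period_change(target) - period_change(control)


# Hypothetical weekly counts of a health outcome (made-up numbers).
target = {"before": [50, 52, 48], "during": [40, 42, 38], "after": [49, 51, 50]}
control = {"before": [30, 31, 29], "during": [29, 30, 28], "after": [30, 30, 30]}

effect = intervention_effect(target, control)
print(effect)
```

With these made-up numbers, the target population shows 10 fewer events per week during the intervention and the control population shows 1 fewer, so the contrast attributes a reduction of 9 events per week to the intervention. A real accountability analysis would typically use regression models that also adjust for weather, season, and other time-varying factors.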