Air pollution is a major public health concern, and large numbers of epidemiological studies have been conducted to quantify its impacts. One study design used to quantify these impacts is a spatial areal unit design, which estimates a population-level association using data on air pollution concentrations and disease incidence that have been spatially aggregated to a set of nonoverlapping areal units. A major criticism of this study design is that the specification of these areal units is arbitrary, and if one changed their boundaries then the aggregated data would change despite the locations of the disease cases and the air pollution surface remaining the same. This is known as the modifiable areal unit problem, and this is the first article to quantify its likely effects in air pollution and health studies. In addition, we derive an aggregate model for these data directly from an idealized individual-level risk model and show that it provides better estimation than the commonly used ecological model. Our work is motivated by a new study of air pollution and health in Scotland, and we find consistent significant associations between air pollution and respiratory disease but not for circulatory disease.

KEYWORDS
epidemiological study, nitrogen dioxide and particulate matter, spatially aggregated disease counts

1 INTRODUCTION

Air pollution continues to be a major global public health problem, with the World Health Organisation (WHO) linking seven million deaths to it each year worldwide (World Health Organisation, 2016). In the United Kingdom an estimated 40,000 deaths are attributed to air pollution exposure each year (Royal College of Physicians, 2016), and legal limits for individual air pollutants have been stipulated that must not be exceeded (Department for the Environment, Food and Rural Affairs, 2015).
The choice of these limits is informed by epidemiological studies, and both individual- and ecological (or group)-level study designs are prevalent in the literature. Ecological-level studies are popular because they utilize available population-level disease incidence data, which makes them comparatively fast and inexpensive to implement. However, all that can be inferred is a population-level association rather than an individual-level causal relationship, and wrongly assuming the two are the same is known as ecological bias (Wakefield & Salway, 2001). Such bias is due in part to within-population variation in pollution exposures and disease incidence, because one does not

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.