Scientific question
Individual- and activity-based approaches to simulating exposure to air pollutants require synthetic populations that realistically reflect the demographic profiles of individuals in an urban area. Demographic profiles shape how individuals behave in urban space (activities, mobility) and therefore determine the resulting exposure risks and environmental inequalities. There is thus a strong need to assess how well population modeling methods can reproduce, from existing databases, the combinations of socio-demographic attributes found in a population. Limited access to complete, high-resolution databases remains a major constraint on these approaches.
Objective
This work evaluates the potential of a statistical approach for generating synthetic populations at the scale of dwellings, with coherent socio-demographic profiles. The approach is built on, and validated against, existing open databases. The ambition is to use such synthetic populations to produce a comprehensive assessment of environmental exposure risk that can be cross-referenced with lifestyles, indicators of social, professional or demographic category, and even health vulnerability data.
Method
The approach uses conditional probabilities to model the socio-demographic attributes of individuals, sampled through a Markov chain Monte Carlo (MCMC) simulation. Households are then assigned to dwellings by matching income classes to housing price classes. The resulting population generation model was tested on the Paris region (Île-de-France) for the year 2010 and applied to a population of almost 12 million individuals. It relies on census and survey databases.
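As an illustration of sampling attributes from conditional probabilities with MCMC, the following is a minimal Gibbs-sampling sketch in Python. It is not the model used in this work: the two attributes, their categories, and every probability value are hypothetical, and the conditionals are derived from a toy joint table only so that the output can be checked. In practice the conditionals would be estimated from census and survey tables.

```python
import random
from collections import Counter

# Toy joint distribution over (age_class, occupation).
# All values are illustrative assumptions, not data from the study.
JOINT = {
    ("under_25", "student"): 0.20,
    ("under_25", "employed"): 0.08,
    ("under_25", "retired"): 0.02,
    ("25_64", "student"): 0.05,
    ("25_64", "employed"): 0.40,
    ("25_64", "retired"): 0.05,
    ("65_plus", "student"): 0.01,
    ("65_plus", "employed"): 0.04,
    ("65_plus", "retired"): 0.15,
}


def conditional(free_index, fixed_value):
    """Return P(free attribute | other attribute = fixed_value) as (values, weights)."""
    fixed_index = 1 - free_index
    rows = {key[free_index]: p for key, p in JOINT.items()
            if key[fixed_index] == fixed_value}
    total = sum(rows.values())
    values = list(rows)
    return values, [rows[v] / total for v in values]


def gibbs_sample(n_individuals, burn_in=500, thin=5, seed=0):
    """Draw synthetic individuals by resampling each attribute given the other."""
    rng = random.Random(seed)
    state = ["25_64", "employed"]  # arbitrary starting individual
    population = []
    step = 0
    while len(population) < n_individuals:
        for i in (0, 1):
            values, weights = conditional(i, state[1 - i])
            state[i] = rng.choices(values, weights=weights)[0]
        step += 1
        # Discard the burn-in, then keep every `thin`-th state to reduce autocorrelation.
        if step > burn_in and (step - burn_in) % thin == 0:
            population.append(tuple(state))
    return population


population = gibbs_sample(5000)
age_share = Counter(age for age, _ in population)
print({age: count / len(population) for age, count in age_share.items()})
```

The sampled age shares should approach the marginals of the toy table (0.30, 0.50, 0.20) as the number of draws grows. The appeal of this kind of sampler is that it only needs conditional tables, which can come from different data sources when no complete joint table is available.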
Results
Validation against regional census data shows that the model accurately reproduces the demographic attributes of individuals (age, gender, professional category, income) as well as their combinations, at both regional and sub-municipal levels. Notably, the population distribution at the scale of the modeled buildings remains consistent with observed data patterns.
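A comparison of this kind can be sketched as follows. The error measure (sum of absolute differences between category shares) is an assumption chosen for illustration, since the metric actually used is not stated here, and both the observed shares and the synthetic records are hypothetical.

```python
from collections import Counter


def category_shares(records, categories):
    """Share of each category among the records (0.0 for absent categories)."""
    counts = Counter(records)
    total = len(records)
    return {c: counts[c] / total for c in categories}


def total_absolute_error(synthetic_shares, observed_shares):
    """Sum over categories of |synthetic share - observed share|."""
    return sum(abs(synthetic_shares[c] - observed_shares[c])
               for c in observed_shares)


# Hypothetical observed census shares for one attribute (age class).
observed = {"under_25": 0.30, "25_64": 0.50, "65_plus": 0.20}

# Hypothetical synthetic population of 100 individuals.
synthetic_records = ["under_25"] * 31 + ["25_64"] * 49 + ["65_plus"] * 20

synthetic = category_shares(synthetic_records, observed)
error = total_absolute_error(synthetic, observed)
print(round(error, 4))
```

The same comparison can be repeated per attribute, per combination of attributes, and per spatial unit to check agreement at both regional and sub-municipal levels.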
Conclusions and relevance
The outcomes of this work demonstrate that our approach can create, from public data, a coherent synthetic population with broad socio-demographic profiles. They support the use of this approach in an activity-based air quality exposure study, and thus in exploring the interrelations between social determinants and environmental risks. Because the method is not specific to one region or dataset, it can be extended to broader demographic profiles, including health indicators, and to other study regions.