Background: Advances in computing power have enabled the collection, linkage and processing of big data. Combined with robust causal inference methods, big data can be used to answer research questions about the mechanisms underlying an exposure-outcome relationship. The g-formula is a flexible approach to causal mediation analysis that is well suited to the big data context. Despite its advantages, the approach is underused in perinatal epidemiology, and didactic explanations of its implementation remain limited.
Objective: The aim of this study was to provide a didactic application of the mediational g-formula by means of perinatal health inequalities research.
Methods: The analytical procedure of the mediational g-formula is illustrated by investigating whether the relationship between neighbourhood socioeconomic status (SES) and small for gestational age (SGA) is mediated by neighbourhood social environment. Data on singleton births that occurred in the Netherlands between 2010 and 2017 (n = 1,217,626) were obtained from the Netherlands Perinatal Registry and linked to sociodemographic national registry data and neighbourhood-level data. The g-formula settings corresponded to a hypothetical improvement in neighbourhood SES from disadvantaged to non-disadvantaged.
Results: At the population level, a hypothetical improvement in neighbourhood SES resulted in a 6.3% (95% confidence interval [CI] 5.2, 7.5) relative reduction in the proportion of SGA (the total effect). The total effect was decomposed into the natural direct effect (5.6%, 95% CI 5.1, 6.1) and the natural indirect effect (0.7%, 95% CI 0.6, 0.9). In terms of the magnitude of mediation, the natural indirect effect accounted for 11.4% (95% CI 9.2, 13.6) of the total effect of neighbourhood SES on SGA.
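The decomposition of a total effect into natural direct and indirect effects can be illustrated with a minimal sketch. It rests on loud assumptions and is not the study's implementation: the data are simulated, there is a single binary confounder, the g-formula is computed nonparametrically from stratum-specific means rather than from fitted models, and effects are expressed as risk differences rather than relative reductions. All variable names and simulation parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Simulated data (hypothetical parameters, not the registry data).
c = rng.binomial(1, 0.4, size=n)                            # confounder
a = rng.binomial(1, 0.3 + 0.3 * c)                          # exposure: 1 = disadvantaged SES
m = rng.binomial(1, 0.6 - 0.2 * a + 0.1 * c)                # mediator: 1 = favourable social environment
y = rng.binomial(1, 0.08 + 0.03 * a - 0.02 * m + 0.01 * c)  # outcome: 1 = SGA


def mean_y(a_val, m_val, c_val):
    """Empirical E[Y | A=a_val, M=m_val, C=c_val]."""
    mask = (a == a_val) & (m == m_val) & (c == c_val)
    return y[mask].mean()


def prob_m(m_val, a_val, c_val):
    """Empirical P(M=m_val | A=a_val, C=c_val)."""
    mask = (a == a_val) & (c == c_val)
    return (m[mask] == m_val).mean()


def g_formula(a_y, a_m):
    """Estimate E[Y(a_y, M(a_m))]: the outcome model uses exposure a_y,
    the mediator distribution uses exposure a_m, standardised over C."""
    total = 0.0
    for c_val in (0, 1):
        p_c = (c == c_val).mean()
        for m_val in (0, 1):
            total += mean_y(a_y, m_val, c_val) * prob_m(m_val, a_m, c_val) * p_c
    return total


# Contrast: disadvantaged (1) versus non-disadvantaged (0) neighbourhood SES.
te = g_formula(1, 1) - g_formula(0, 0)    # total effect
nde = g_formula(1, 0) - g_formula(0, 0)   # natural direct effect
nie = g_formula(1, 1) - g_formula(1, 0)   # natural indirect effect
pm = nie / te                             # proportion mediated

print(f"TE={te:.4f}  NDE={nde:.4f}  NIE={nie:.4f}  proportion mediated={pm:.3f}")
```

In this sketch the total effect equals the sum of the direct and indirect effects by construction, because the counterfactual term `g_formula(1, 0)` cancels. In practice, confidence intervals are commonly obtained by bootstrapping the whole procedure.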
Conclusions: The mediational g-formula is a flexible approach to perform causal mediation analysis that is suited for big data contexts in perinatal health research. Its