International migrants comprised 14% of the UK’s population in 2020; however, their health is rarely studied at a population level using primary care electronic health records because they are difficult to identify. We developed a migration phenotype using country of birth, visa status, non-English main/first language and non-UK-origin codes and applied it to the Clinical Practice Research Datalink (CPRD) GOLD database of 16,071,111 primary care patients between 1997 and 2018. We compared the completeness and representativeness of the identified migrant population to Office for National Statistics (ONS) country-of-birth and 2011 census data by year, age, sex, geographic region of birth and ethnicity. Between 1997 and 2018, 403,768 migrants (2.51% of the CPRD GOLD population) were identified: 178,749 (1.11%) had foreign-country-of-birth or visa-status codes, 216,731 (1.35%) had non-English-main/first-language codes, and 8,288 (0.05%) had non-UK-origin codes. The cohort's distribution by sex and region of birth was similar to that in the ONS data. Migration recording improved over time, and younger migrants were better represented than those aged ≥50. The validated phenotype identified a large migrant cohort for use in migration health research in CPRD GOLD to inform healthcare policy and practice. Because migration status was under-recorded in earlier years and at older ages, future studies in these groups should be interpreted cautiously.