Introduction
Research using linked routine population-based data collected for non-research purposes has increased in recent years because such data are a rich and detailed source of information. The objective of this study is to present an approach to preparing and linking data from administrative sources in a middle-income country, to estimate the accuracy of the linkage, and to identify potential sources of bias by comparing linked and non-linked records.

Methods
We linked two Brazilian administrative datasets covering the period 2001 to 2015: the Live Birth Information System (SINASC) and the baseline of the 100 Million Brazilian Cohort (created using administrative records from over 114 million individuals whose families applied for social assistance via the National Register for Social Programmes). The linkage used maternal attributes (maternal name, age, date of birth, and municipality of residence) and was performed with CIDACS-RL, a linkage tool developed in-house. We then estimated the accuracy of the linkage and examined the characteristics of missed matches to identify any potential source of bias.

Results
A total of 27,699,891 live births were recorded; of those, 16,447,414 (59.4%) were linked with SINASC. The sensitivity of the linkage ranged from 39.3% in 2001 to 82.1% in 2014. Linkage sensitivity improved substantially after the maternal date of birth attribute was introduced in 2011. Our analyses indicated a slightly higher proportion of missing data among missed matches, and a higher proportion of people living in urban areas and self-declared as Caucasian among linked pairs than among non-linked sets.

Discussion
We demonstrated that CIDACS-RL is capable of performing high-quality and accurate linkage even with a limited number of common attributes, using indexation as a blocking strategy in large routine databases from a middle-income country. However, residual records occurred more often among people living in worse conditions. The results presented in this study reinforce the need to evaluate linkage quality and, when necessary, to take linkage error into account in the analysis of any generated dataset.
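
The general idea of blocking followed by similarity scoring on maternal attributes, and of measuring sensitivity against known true matches, can be illustrated with a minimal sketch. This is not the CIDACS-RL implementation: the record fields, the blocking key, the score weights, the 0.85 threshold, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def block_key(record):
    # Hypothetical blocking key: municipality code plus the first letter of
    # the mother's name. Only records sharing a key are ever compared.
    return (record["municipality"], record["mother_name"][:1].upper())


def similarity(a, b):
    # Illustrative score: name similarity, combined with date-of-birth
    # agreement when both records carry a date of birth.
    name = SequenceMatcher(
        None, a["mother_name"].upper(), b["mother_name"].upper()
    ).ratio()
    if a.get("mother_dob") and b.get("mother_dob"):
        dob = 1.0 if a["mother_dob"] == b["mother_dob"] else 0.0
        return 0.7 * name + 0.3 * dob
    return name  # fall back to the name alone when a date of birth is missing


def link(births, cohort, threshold=0.85):
    # Index the cohort by blocking key, then keep the best-scoring candidate
    # for each birth record if it reaches the (assumed) threshold.
    index = defaultdict(list)
    for rec in cohort:
        index[block_key(rec)].append(rec)
    links = {}
    for birth in births:
        candidates = index.get(block_key(birth), [])
        if not candidates:
            continue
        best = max(candidates, key=lambda c: similarity(birth, c))
        if similarity(birth, best) >= threshold:
            links[birth["id"]] = best["id"]
    return links


def sensitivity(links, truth):
    # Share of true matches (birth id -> cohort id) that the linkage recovered.
    recovered = sum(1 for b, c in truth.items() if links.get(b) == c)
    return recovered / len(truth)


# Toy records, invented for this example.
cohort = [
    {"id": "c1", "mother_name": "Maria da Silva", "mother_dob": "1985-03-02", "municipality": "A"},
    {"id": "c2", "mother_name": "Ana Souza", "mother_dob": "1990-07-15", "municipality": "A"},
    {"id": "c3", "mother_name": "Joana Lima", "mother_dob": "1988-11-30", "municipality": "A"},
]
births = [
    {"id": "b1", "mother_name": "Maria da Silva", "mother_dob": "1985-03-02", "municipality": "A"},
    {"id": "b2", "mother_name": "Ana Sousa", "mother_dob": None, "municipality": "A"},
    {"id": "b3", "mother_name": "Joana Lima", "mother_dob": "1988-11-30", "municipality": "B"},
]
truth = {"b1": "c1", "b2": "c2", "b3": "c3"}

links = link(births, cohort)
print(links)
print(f"sensitivity = {sensitivity(links, truth):.3f}")
```

In this toy run the third birth record is a missed match because its municipality differs from the cohort record, so the two never share a block. It shows how a blocking choice can systematically exclude some groups, which is why the characteristics of missed matches need to be examined.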