2017
DOI: 10.3386/w24019

How Well Do Automated Linking Methods Perform? Lessons from U.S. Historical Data

Abstract: for their many contributions to the LIFE-M project. NBER working papers are circulated for discussion and comment purposes. They have not been peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies official NBER publications.

Cited by 31 publications (43 citation statements)
References 16 publications
“…Nevertheless, a linking rate of less than 100 percent is due to death, common names, and errors of data entry from either the initial census enumerators or clerks that digitized the data. This method of linking certainly leads to some false links; nonetheless, this method has been shown to produce reliable intergenerational elasticity estimates (Bailey et al 2017).…”
Section: A Data Creation (mentioning)
confidence: 99%
“…Mis-matched data. There is considerable debate in the economic history community about the quality of linked data and how it varies based on various matching methods (Bailey et al, 2017, Abramitzky et al, 2019). We test whether the quality of the match influences our results.…”
Section: Assessing the Potential Impact Of Missing Or Low Quality Dat (mentioning)
confidence: 97%
“…In the second generation analyses, I code each ethnicity based on his parent's birthplace, mother tongue, and Jewish index. Research that uses large-scale historical record linkage has led to questions about how representative these new samples are of the population (Bailey et al 2017). As noted in the methods section, the first and second generation matched samples produce different means than the full population along various dimensions.…”
Section: Appendix A: Coding For Ethnicity (mentioning)
confidence: 99%
“…The first generation results come from the fixed effects models while the second generation effects control for variables described in Table 6 Suppressed coefficients are available upon request. +.05<p<.1, *p<.05, **p<.01, ***p<.001 (two-tailed) In addition to questions about representativeness, recent evidence from Bailey et al (2017) suggest that the iterative matching approach using the soundex algorithm defined in the methods section may be particularly sensitive to false linkages (i.e. matching person A in census A to person B in census B).…”
Section: Appendix A: Coding For Ethnicity (mentioning)
confidence: 99%
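
The last statement above attributes false linkages to Soundex-based matching. One way to see why this can happen is that Soundex maps differently spelled surnames to the same four-character code, so distinct people can become candidate matches. The following is a minimal sketch of standard American Soundex, not the linking procedure used in any of the cited papers; the function name `soundex` and the sample surnames are illustrative assumptions.

```python
# Minimal sketch of American Soundex (illustrative; not any cited paper's code).
# Consonant groups and their digits; vowels and y carry no digit.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code for a name ('' if no letters)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()
    prev = CODES.get(first, "")  # the first letter's digit also suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of equal digits
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so the same digit may appear again
    return (result + "000")[:4]  # pad with zeros, truncate to four characters


# Hypothetical surnames: differently spelled names share a code.
for surname in ["Smith", "Smyth", "Robert", "Rupert"]:
    print(surname, soundex(surname))
```

Because "Smith" and "Smyth" share a code, as do "Robert" and "Rupert", a matcher that blocks on Soundex alone treats each pair as interchangeable and has to rely on other fields, such as age or birthplace, to tell the individuals apart.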