2014
DOI: 10.1007/978-3-319-06608-0_40

A Graph Matching Method for Historical Census Household Linkage

Abstract: Linking historical census data across time is a challenging task for several reasons, including data quality, limited individual information, and changes to households over time. Although most census data linking methods link records that correspond to individual household members, recent advances show that linking households as a whole provides more accurate results and fewer multiple household links. In this paper, we introduce a graph-based method to link households, which takes the structural re…

Cited by 25 publications (20 citation statements)
References 17 publications
“…We propose a graph-based solution to address the scalability problem. 2,[7][8][9][10][11][12] We define nodes as unique sets of patient identifying information as recorded in the NHLS database. We define weighted edges as the scored comparisons between these nodes, with scores calculated using a modified version of the Fellegi-Sunter approach.…”
Section: Using Graphs To Guide Record Linkage (mentioning)
confidence: 99%
“…Most projects in historical record linkage are challenged by low data quality (due to scanning and transcription errors of handwritten forms), as well as a lack of ground truth data (which is difficult and expensive to obtain). Therefore, research in this area has concentrated on either exploiting the structure in such data sets (such as households and families) and developed group linkage methods [8,13,14,24] or…”
Section: Related Work (mentioning)
confidence: 99%
“…As with other historical data sets [1,14], this birth data set has a very small number of unique name values (2,055 first names and only 547 last names). As Figure 3 shows, the frequency distributions of names are also very skewed.…”
Section: Experimental Evaluation (mentioning)
confidence: 99%
“…census records (Fu et al 2011) and publication records.…”
Section: Identification Of Duplicates (mentioning)
confidence: 99%