2018
DOI: 10.2139/ssrn.3214172

Using a Probabilistic Model to Assist Merging of Large-Scale Administrative Records

Cited by 95 publications (105 citation statements)
References 45 publications
“…Next, the use of LexisNexis linkage provided augmented certification records obtained from the Florida State FMO. We also employed the latest linkage software (FastLink [21]) which in addition to having superior computational efficiency has the advantage of generating error (or “confusion”) tables estimating linkage accuracy. It was estimated that our FCDS linkage was over 99% accurate with a corresponding sensitivity and specificity of 98.9% and 100%, respectively [65].…”
Section: Discussion (mentioning)
confidence: 99%
“…To this end, the exploratory framework described above can be thought of as modular, in that different steps can be replaced with others as the open‐source code is refined further (such as a methodological advancement similar to Enamorado et al.). While research involving increasingly bigger, innovative types of open data will only become more important over time (within this domain and others), data originators too have an important role to play in minimizing the costs associated with aggregating and cleaning the data at source.…”
Section: Results (mentioning)
confidence: 97%
“…The gains from the approximate routine are considerably higher than the 30% improvement achieved by Enamorado et al. () when compared against exact matches only. Figure shows what matches occur across which register, highlighting the dominance of the private sector in the local government procurement process.…”
Section: Methods (mentioning)
confidence: 99%
“…Since then, however, a number of approaches have been developed which rely heavily on data mining and machine learning. One especially relevant piece of work is Enamorado et al (2017), 2 which develops 'a fast and scalable algorithm to implement the canonical probabilistic model of record linkage' able to 'efficiently handle millions of observations while accounting for missing data and measurement error'. While the scope of this literature is too broad to discuss in detail here, we refer where appropriate to the applicable work in this field at the relevant points in our discussion throughout the remainder of this paper.…”
Section: Technical Literature (mentioning)
confidence: 99%