2022
DOI: 10.48550/arxiv.2206.14583
Preprint

Benchmarking Bayesian Improved Surname Geocoding Against Machine Learning Methods

Abstract: Bayesian Improved Surname Geocoding (BISG) is the most popular method for proxying race/ethnicity in voter registration files that do not contain it. This paper benchmarks BISG against a range of previously untested machine learning alternatives, using voter files with self-reported race/ethnicity from California, Florida, North Carolina, and Georgia. This analysis yields three key findings. First, when given the exact same inputs, BISG and machine learning perform similarly for estimating aggregate racial/eth…

Cited by 1 publication
(2 citation statements)
References 15 publications
“…We observe this pattern as well in the validation study in Section 4. New methods continue to be developed that improve the calibration of BISG probabilities, including some machine learning methods based on labeled data (Imai et al., 2022; Argyle and Barber, 2022; Decter-Frain, 2022).…”
Section: Setup and BISG Procedures
Citation type: mentioning
confidence: 99%
“…Finally, while the discussion in this section has been focused on the BISG methodology, the qualitative results and necessary assumptions carry over to other approaches which produce probabilistic predictions of individual race, such as those recently developed by Argyle and Barber (2022) and Decter-Frain (2022). Just as with BISG, well-calibrated probabilities are not generally sufficient to produce unbiased estimates of racial disparities using the outputs of these methods.…”
Section: March 7 2023
Citation type: mentioning
confidence: 99%