Political scientists have increasingly relied on Bayesian Improved Surname Geocoding (BISG) to predict individual-level race and ethnicity using surnames and geographic information. However, these predictions are prone to systematic measurement error when geographic variables are included as independent variables, leading to bias and an increased likelihood of false positives. We compare five estimation methods, including approaches that use names, geography, and images. We find that a hybrid approach that combines surname-based Bayesian estimation with the use of publicly available images in a convolutional neural network not only reduces bias in downstream analyses, but also improves predictive accuracy in a sample of over 16,000 local elected officials. We conclude with a discussion of ethical considerations and describe domains where the hybrid approach might be especially suitable.
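As a rough illustration of the surname-plus-geography step that BISG performs, the following minimal sketch applies Bayes' rule under the usual assumption that surname and location are conditionally independent given race. The function name `bisg_posterior`, the category labels, and all probabilities are hypothetical placeholders, not values or code from the paper; real applications typically draw these inputs from Census surname lists and geographic population counts.

```python
from typing import Dict


def bisg_posterior(
    p_race_given_surname: Dict[str, float],
    p_geo_given_race: Dict[str, float],
) -> Dict[str, float]:
    """Combine a surname-based prior with a geographic likelihood.

    Implements P(race | surname, geo) proportional to
    P(race | surname) * P(geo | race), assuming surname and
    geography are conditionally independent given race.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("All categories have zero probability.")
    # Normalize so the posterior probabilities sum to one.
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical inputs for illustration only.
surname_prior = {"white": 0.6, "black": 0.3, "hispanic": 0.1}
geo_likelihood = {"white": 0.001, "black": 0.004, "hispanic": 0.002}

posterior = bisg_posterior(surname_prior, geo_likelihood)
```

Because the geographic likelihood enters the prediction directly, any error in the predicted probabilities is correlated with geography by construction, which is one way to see why using such predictions alongside geographic covariates can bias downstream estimates.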
Transformative changes in urban economies are raising vital questions about minority representation. Given that cities are sites of political power for communities of color, gentrification and the housing affordability crisis threaten to erode decades of progress. This paper considers the impact of these economic and demographic shifts on minority candidate success and supply. Collecting data on 166 city councils across several decades, we find that white population growth is associated with reductions in local political power for Black and Latino councilors. We also observe modest evidence that local economic improvements may not have deleterious effects on city council diversity. We probe these findings using data on local elections as well as over 380,000 tweets of city councilors, and uncover evidence of a candidate supply mechanism in the case of "racial gentrification" and a credit-claiming mechanism in the case of "economic gentrification." We conclude by discussing the political implications of the cross-cutting effects we observe.
The growth of machine learning techniques and massive administrative data sets has dovetailed with increasing interest in racial inequality in political science. While this has invigorated research on race and politics, scholarly investments in Big Data have proceeded without sufficient attention to the ethical considerations governing the use of publicly available, but sensitive, information. This paper addresses this gap by providing empirical evidence on how participants themselves perceive the ethics of data use. In two large-sample survey experiments, we examine how members of the public perceive research ethics in studies of racial and ethnic politics that leverage public and administrative records, and provide practical recommendations for scholars using these data sets in their research. By highlighting this growing gap, we hope to spark conversation regarding ethics as the academic community confronts these dramatic advances in data collection and statistical analysis.