The quantity and quality of administrative information available to National Statistical Institutes have been constantly increasing over the past several years. However, different sources of administrative data are not expected to share the same population coverage, so estimating the true population size from the combined set of data poses several methodological challenges that set the problem apart from a classical capture-recapture setting. In this article, we consider two specific aspects of this problem: (1) misclassification of the units, leading to lists with both overcoverage and undercoverage; and (2) lists focusing on a specific subpopulation, leaving a proportion of the population with null probability of being captured. We propose an approach to this problem that employs a class of capture-recapture methods based on Latent Class models. We assess the proposed approach via a simulation study, then apply the method to five sources of empirical data to estimate the number of active local units of Italian enterprises in 2011.
We propose a method for estimating the size of a population in a multiple record system in the presence of missing data. The method is based on a latent class model where the parameters and the latent structure are estimated using a Gibbs sampler. The proposed approach is illustrated through the analysis of a data set already known in the literature, which consists of five registrations of neural tube defects.
The identification and treatment of “one‐inflation” in estimating the size of an elusive population has received increasing attention in the capture–recapture literature in recent years. The phenomenon occurs when the number of units captured exactly once clearly exceeds the expectation under a baseline count distribution. Ignoring one‐inflation has serious consequences for estimation of the population size, which can be drastically overestimated. In this paper we propose a Bayesian approach for Poisson, geometric, and negative binomial one‐inflated count distributions. Posterior inference for the population size is obtained using a Gibbs sampler. We also provide a Bayesian approach to model selection. We illustrate the proposed methodology with simulated and real data and propose a new application in official statistics to estimate the number of people implicated in the exploitation of prostitution in Italy.
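To make the overestimation claim concrete, the following is a minimal sketch rather than the Bayesian method of the abstract. It assumes one particular parameterization of one-inflation (a mixture that places extra mass `omega` on the count 1 of an untruncated Poisson), which may differ from the one used in the paper. It then applies Chao's classical lower-bound estimator, swapped in here purely for illustration, to the expected capture frequencies. The names `one_inflated_poisson_pmf`, `expected_freqs`, and `chao_estimate` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Baseline Poisson probability of being captured k times."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def one_inflated_poisson_pmf(k, lam, omega):
    """Assumed mixture: with probability omega the count is exactly 1,
    otherwise it follows the baseline Poisson(lam)."""
    base = (1.0 - omega) * poisson_pmf(k, lam)
    return base + omega if k == 1 else base


def expected_freqs(N, lam, omega, kmax=30):
    """Expected number of units captured exactly k times, for k = 1..kmax.
    Units with k = 0 are unobserved and therefore excluded."""
    return {k: N * one_inflated_poisson_pmf(k, lam, omega)
            for k in range(1, kmax + 1)}


def chao_estimate(freqs):
    """Chao's lower-bound estimator: n + f1^2 / (2 * f2)."""
    n = sum(freqs.values())
    return n + freqs[1] ** 2 / (2.0 * freqs[2])


N_true = 1000
no_inflation = chao_estimate(expected_freqs(N_true, lam=1.0, omega=0.0))
with_inflation = chao_estimate(expected_freqs(N_true, lam=1.0, omega=0.2))
print(round(no_inflation), round(with_inflation))
```

Under the baseline Poisson, the estimator recovers the true size from the expected frequencies. Once singletons are inflated, the `f1**2` term grows and the estimate moves well above the true size, which is the behaviour the abstract warns about.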