2016
DOI: 10.1111/rssa.12179

Estimating the Density of Ethnic Minorities and Aged People in Berlin: Multivariate Kernel Density Estimation Applied to Sensitive Georeferenced Administrative Data Protected Via Measurement Error

Abstract: Modern systems of official statistics require the timely estimation of area-specific densities of subpopulations. Ideally, estimates should be based on precise geocoded information, which is not available because of confidentiality constraints. One approach for ensuring confidentiality is by rounding the geo-coordinates. We propose multivariate non-parametric kernel density estimation that reverses the rounding process by using a measurement error model. The methodology is applied to the Berlin register of resi…

Cited by 20 publications (18 citation statements)
References 41 publications
“…The generation of the pseudo-sample step brings a stochastic element into the algorithm, giving a more realistic distribution of the missing observations. In our application, the missing data are the exact geo-coordinates and the maximization of a likelihood is replaced by the kernel density estimation (a generalised SEM, Groß et al 2017). In contrast to SEM, a simple EM algorithm would clearly not be helpful in this application, as all observations within an area would fall on the same location and thus not prevent a bias of the resulting kernel density estimate.…”
Section: Methods (mentioning)
confidence: 99%
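
The loop described in the statement above (draw pseudo-samples of the unobserved exact coordinates, then re-estimate the kernel density from them) can be illustrated with a minimal one-dimensional sketch. This is not the implementation of Groß et al.; it assumes a Gaussian kernel, a fixed bandwidth, and a discrete evaluation grid, and the names `gaussian_kde_on_grid` and `sem_kde_rounded` are hypothetical.

```python
import numpy as np


def gaussian_kde_on_grid(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on a fixed grid."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


def sem_kde_rounded(rounded, step, grid, bandwidth, n_burn=5, n_keep=20, seed=0):
    """SEM-style density estimate from rounded data (illustrative only).

    Assumption: each exact value lies within step/2 of its rounded value,
    and the grid has points inside every such rounding cell.
    """
    rng = np.random.default_rng(seed)
    values, counts = np.unique(rounded, return_counts=True)

    # Start from the naive estimate that treats rounded values as exact.
    density = gaussian_kde_on_grid(rounded, grid, bandwidth)
    kept = []

    for iteration in range(n_burn + n_keep):
        pseudo = []
        for value, count in zip(values, counts):
            # Stochastic step: draw pseudo-samples inside the rounding cell,
            # with probability proportional to the current density estimate.
            in_cell = np.abs(grid - value) <= step / 2.0
            weights = density[in_cell] / density[in_cell].sum()
            pseudo.append(rng.choice(grid[in_cell], size=count, p=weights))
        pseudo = np.concatenate(pseudo)

        # Re-estimation step: the kernel density estimate takes the place
        # of the likelihood maximization in a standard SEM algorithm.
        density = gaussian_kde_on_grid(pseudo, grid, bandwidth)
        if iteration >= n_burn:
            kept.append(density)

    # Average the estimates obtained after the burn-in iterations.
    return np.mean(kept, axis=0)


# Usage with synthetic data: exact values are rounded to whole units.
data_rng = np.random.default_rng(1)
exact = data_rng.normal(loc=0.0, scale=2.0, size=500)
rounded = np.round(exact)
grid = np.linspace(-15.0, 15.0, 601)
estimate = sem_kde_rounded(rounded, step=1.0, grid=grid, bandwidth=0.4)
```

Because the pseudo-samples are redrawn at random in every iteration, observations sharing a rounded value spread across their cell instead of all falling on the same location, which is the contrast with a simple EM step that the statement draws.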
“…Therefore, we present an approach in which geo-coordinates are simulated from area-specific aggregates. The method proposed in this work is similar to the approach of Groß et al (2017), where it is used to counteract the rounding of geo-coordinates due to confidentiality reasons. In their analysis, kernel densities are generated to detect concentration areas of migrants and elderly persons in Berlin.…”
Section: Introduction (mentioning)
confidence: 99%
“…To avoid such problems, various types of location protecting techniques have been used to protect these privacies [4,5,9,11,13,14]. Geocoding of crime events to a street segment or a city block can encrypt their precise geographic locations [4][5][6]. Transforming a point map to a density map can help mask the exact location of crime events [4,5,11,16,39].…”
Section: Literature Review (mentioning)
confidence: 99%
“…Kernel density estimation (KDE) is widely adopted to show the overall crime distribution and at the same time obscure exact crime locations due to the confidentiality of crime data in many countries [1][2][3][4][5][6]. In the field of crime prediction, KDE is widely used to generate hotspots, which are then used to guide police for targeted intervention.…”
Section: Introduction (mentioning)
confidence: 99%
“…As a consequence, the proposed estimator is a partly Bayesian method in the sense that the X_i as well as p, a, β and τ are treated as random variables but not θ. As already discussed in Groß et al (2015) this approach is equal to a generalized Stochastic Expectation Maximization (SEM) algorithm (Celeux et al 1996). This algorithm is strongly related to the Gibbs-sampler but usually converges much faster (Diebolt et al 1994).…”
mentioning
confidence: 99%