2021
DOI: 10.48550/arxiv.2112.10912
Preprint

Common Misconceptions about Population Data

Abstract: Databases covering all individuals of a population are increasingly used for research studies in domains ranging from public health to the social sciences. There is also growing interest among governments and businesses in using population data to support data-driven decision making. The massive size of such databases is often mistaken for a guarantee of valid inferences about the population of interest. However, population data have characteristics that make them challenging to use, including various assumptions bein…

Cited by 3 publications (6 citation statements)
References 54 publications
“…A diverse range of PPRL techniques has been developed [42], including techniques based on secure multiparty computation (SMC), secure hash encoding, and encoding of values into bit vectors. While SMC techniques are accurate and provably secure, because PPRL generally requires the calculation of similarities between encoded values (due to errors and variations that can occur in QID values [9]) these techniques often have high computational costs [16]. PPRL techniques based on some form of hashing or embedding of sensitive values, known as perturbation-based techniques [43], on the other hand provide adequate privacy, linkage quality, and scalability to link large sensitive databases.…”
Section: Introduction
mentioning
confidence: 99%
“…4 The second challenge concerns the lack of familiarity among researchers and managers with the use of large, complex secondary databases for surveillance, evaluation, and research, which can lead to incorrect interpretations of the evidence generated. 2,6 In contrast to primary data collected to answer a specific research question, secondary datasets are generated and processed through procedures over which researchers and managers generally have no control. When using a secondary database…”
unclassified
“…These data, especially when linked, have been increasingly used in health, research, surveillance and evaluation activities, as well as in decision-making. 2 …”
mentioning
confidence: 99%