2015
DOI: 10.3233/sji-150892

“Re-make/Re-model”: Should big data change the modelling paradigm in official statistics?

Abstract: Big data offers many opportunities for official statistics: for example increased resolution, better timeliness, and new statistical outputs. But there are also many challenges: uncontrolled changes in sources that threaten continuity, lack of identifiers that impedes linking to population frames, and data that refers only indirectly to phenomena of statistical interest. We discuss two approaches to deal with these challenges and opportunities. First, we may accept big data for what they are: an imperfect, yet…

citations
Cited by 20 publications
(9 citation statements)
references
References 9 publications
“…Nonetheless, the use of internet sources for statistical purposes should be used with caution. Selection bias is a predominant issue due to the uneven Internet penetration among and within countries, the population covered by these sources is also subject to daily changes, and often there is difficulty in linking the data to other datasets [31,32,33]. In detail, for search queries, one must also be wary of several factors such as changes to the search algorithm [34] and media events which lead to an unexpected behavior [35].…”
Section: Introduction
confidence: 99%
“…The goal is to extrapolate this decision construct to using sources of data that were collected for other reasons and repurposing them for new research purposes. For example, how would one adopt concepts such as fitness-for-use, relative measures of data quality, and timeliness (Agafitei et al 2015, Braaksma & Zeelenberg 2015, Couper 2013)? One could weigh the tradeoffs of lower quality data with increased timeliness in the short run and the reverse in the longer run.…”
Section: Decision-theoretic Approach
confidence: 99%
“…Current conditions require a wider set of data quality dimensions, including privacy, security, and complexity (UNECE 2015). In addition, the notion that data should be viewed in terms of potential new trade-offs (timeliness versus representativeness), as increasing efficiency when combining data sources, and as potentially generating new data products (Braaksma & Zeelenberg 2015) casts data quality as relational among all data sources. The use of repositories may help to ensure the quality and accessibility of these data to advance social and behavioral research (Petrakos et al 2014).…”
Section: Official Statistics
confidence: 99%
“…These nonstatistically designed sources of data are intoxicating in that they provide easily accessible and often inexpensive information about individuals, businesses, and society. They offer possibilities for studying behavior and social drivers of population attributes and characteristics at a finer level of geographic and demographic resolution and in more frequent time intervals than do survey and census data (Braaksma & Zeelenberg 2015). They hold the promise of understanding human interactions at a societal scale, within a context of rich spatial and temporal dynamics, and for detecting complex interactions and nonlinearities among variables (Agafiţei et al 2015).…”
Section: Advantages and Disadvantages Of Nonstatistically Designed So…
confidence: 99%