In many contexts, confidentiality constraints severely restrict access to unique and valuable microdata. Synthetic data which mimic the original observed data and preserve the relationships between variables but do not contain any disclosive records are one possible solution to this problem. The synthpop package for R, introduced in this paper, provides routines to generate synthetic versions of original data sets. We describe the methodology and its consequences for the data characteristics. We illustrate the package features using a survey data example.
IZA Discussion Paper No. 6140, November 2011. Does Migration Make You Happy? A Longitudinal Study of Internal Migration and Subjective Well-Being. Abstract: The majority of modelling studies on consequences of internal migration focus almost exclusively on the labour market outcomes and the material well-being of migrants.
We investigate whether individuals who migrate within the UK become happier after the move than they were before it and whether the effect is permanent or transient. Using life satisfaction responses from 12 waves of the British Household Panel Survey (BHPS) and employing a fixed-effects model, we derive a temporal pattern of migrants' subjective well-being (SWB) around the time of the migration event. Our findings make an original contribution by revealing for the first time that, on average, migration is preceded by a period when individuals experience a significant decline in happiness. The boost that is received through migration appears to bring people back to their initial level of happiness. As opposed to labour market outcomes of migration, SWB outcomes do not differ significantly between men and women. Perhaps surprisingly, long-distance migrants are at least as happy as short-distance migrants despite the higher social costs that are involved. JEL Classification: J61, R23
Summary. Data holders can produce synthetic versions of data sets when concerns about potential disclosure restrict the availability of the original records. The paper is concerned with methods to judge whether such synthetic data have a distribution that is comparable with that of the original data: what we term general utility. We consider how general utility compares with specific utility: the similarity of results of analyses from the synthetic data and the original data. We adapt a previous general measure of data utility, the propensity score mean-squared error (pMSE), to the specific case of synthetic data and derive its distribution for the case when the correct synthesis model is used to create the synthetic data. Our asymptotic results are confirmed by a simulation study. We also consider two specific utility measures, confidence interval overlap and standardized difference in summary statistics, which we compare with the general utility results. We present two contrasting examples of data syntheses: one illustrating synthetic data that are evaluated as being useful by both general and specific measures and the second where neither is the case. For the second case we show how the general utility measures can identify the deficiencies of the synthetic data and suggest how this can inform possible improvements to the synthesis method.
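The utility measures named in this summary can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on stated assumptions: the propensity model is a simple cell-based (saturated) one for categorical records rather than a fitted regression, the confidence interval overlap is taken as the average share of each interval covered by their intersection, and the standardized difference is scaled by the standard error from the original data. All function names are hypothetical.

```python
from collections import Counter


def pmse_by_cell(original, synthetic):
    """Propensity score mean-squared error with a cell-based propensity model.

    Assumption: each record is a hashable category label (or a tuple of labels),
    and the propensity of a record is the share of synthetic records in its cell.
    """
    n_total = len(original) + len(synthetic)
    # Expected propensity when original and synthetic data are indistinguishable.
    c = len(synthetic) / n_total
    orig_counts, syn_counts = Counter(original), Counter(synthetic)
    total = 0.0
    for cell in set(orig_counts) | set(syn_counts):
        cell_size = orig_counts[cell] + syn_counts[cell]
        propensity = syn_counts[cell] / cell_size
        # Every record in the cell contributes the same squared deviation.
        total += cell_size * (propensity - c) ** 2
    return total / n_total


def ci_overlap(lower_orig, upper_orig, lower_syn, upper_syn):
    """Average share of each interval covered by the intersection.

    Equals 1 for identical intervals and is negative for disjoint ones.
    """
    intersection = min(upper_orig, upper_syn) - max(lower_orig, lower_syn)
    return 0.5 * (intersection / (upper_orig - lower_orig)
                  + intersection / (upper_syn - lower_syn))


def standardized_difference(estimate_orig, estimate_syn, std_error_orig):
    """Absolute difference of estimates in units of the original standard error
    (the choice of scaling is an assumption of this sketch)."""
    return abs(estimate_syn - estimate_orig) / std_error_orig
```

Under this sketch, a synthetic data set with the same cell counts as the original gives a pMSE of 0, while two equally sized data sets that share no cells give 0.25, the largest value possible when the two are the same size. A regression-based propensity model, as is common in practice, would replace the cell proportions with fitted probabilities but leave the final averaging step unchanged.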
We describe results on the creation and use of synthetic data that were derived in the context of a project to make synthetic extracts available for users of the UK Longitudinal Studies. A critical review of existing methods of inference from large synthetic data sets is presented. We introduce new variance estimates for use with large samples of completely synthesised data that do not require them to be generated from the posterior predictive distribution derived from the observed data and can be used with a single synthetic data set. We make recommendations on how to synthesise data based on these findings. An example of synthesising data from the Scottish Longitudinal Study is included to illustrate our results.