Differentially private data releases are often required to satisfy a set of external constraints that reflect the legal, ethical, and logical mandates to which the data curator is obligated. When treated as post-processing, the enforcement of constraints adds an extra phase to the production of privatized data. It is well understood in the theory of multi-phase processing that congeniality, a form of procedural compatibility between phases, is a prerequisite for end users to straightforwardly obtain statistically valid results. Congenial differential privacy is theoretically principled, which facilitates transparency and intelligibility of the mechanism that would otherwise be undermined by ad hoc post-processing procedures. We advocate for the systematic integration of mandated disclosure into the design of the privacy mechanism via standard probabilistic conditioning on the invariant margins. Conditioning automatically ensures congeniality because no extra post-processing phase is necessary. We provide both initial theoretical guarantees and a Markov chain algorithm for our proposal. We also discuss intriguing theoretical issues that arise in comparing congenial differential privacy with optimization-based post-processing, as well as directions for further research.
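The idea of conditioning on invariant margins can be illustrated with a toy sketch. The following is not the paper's algorithm, only a minimal assumed example: a vector of counts is privatized with two-sided geometric noise, and rejection sampling enforces the invariant that the released total equals the true total (and that counts stay non-negative), so no separate post-processing phase is needed.

```python
import numpy as np

def conditional_geometric_mechanism(counts, epsilon, rng, max_tries=100_000):
    """Toy sketch (assumed, not the paper's exact algorithm): add two-sided
    geometric noise to each count, conditioned by rejection on the invariant
    that the total is preserved and all released counts are non-negative."""
    total = counts.sum()
    p = 1 - np.exp(-epsilon)  # success probability of the geometric components
    for _ in range(max_tries):
        # Two-sided geometric noise as a difference of two geometric draws.
        noise = rng.geometric(p, size=len(counts)) - rng.geometric(p, size=len(counts))
        noisy = counts + noise
        if noisy.sum() == total and (noisy >= 0).all():
            return noisy
    raise RuntimeError("rejection sampling did not terminate")

rng = np.random.default_rng(0)
released = conditional_geometric_mechanism(np.array([10, 20, 30]), epsilon=1.0, rng=rng)
```

Rejection sampling is only viable for small tables; the Markov chain algorithm mentioned in the abstract addresses the general case.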
Differential privacy protects individuals' confidential information by subjecting data summaries to probabilistic perturbation mechanisms, carefully designed to minimize undue sacrifice of statistical efficiency. When properly accounted for, differentially private data are conducive to exact inference when approximate computation techniques are employed. This paper shows that approximate Bayesian computation, a practical suite of methods to simulate from approximate posterior distributions of complex Bayesian models, produces exact posterior samples when applied to differentially private perturbed data. An importance sampling implementation of Monte Carlo expectation-maximization for likelihood inference is also discussed. The results illustrate a duality between approximate computation on exact data and exact computation on approximate data. A cleverly designed inferential procedure exploits the alignment between the statistical tradeoff of privacy versus efficiency and the computational tradeoff of approximation versus exactness, so that paying the cost of one gains the benefit of both.
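The duality can be sketched in a minimal assumed example (the model, prior, and parameter names below are illustrative, not taken from the paper): an ABC rejection sampler whose acceptance kernel is the Laplace density of the privacy mechanism itself, so that accepted draws target the posterior given the privatized statistic rather than an approximation to it.

```python
import numpy as np

def abc_with_privacy_kernel(sdp, n, epsilon, sensitivity, num_samples, rng):
    """Assumed toy model: Beta(1, 1) prior on theta, data x ~ Binomial(n, theta),
    released statistic sdp = x/n + Laplace(sensitivity / epsilon) noise.
    ABC accepts a proposed theta with probability proportional to the Laplace
    perturbation density evaluated at sdp minus the simulated statistic, so the
    'approximation' kernel coincides with the privacy mechanism."""
    scale = sensitivity / epsilon
    samples = []
    while len(samples) < num_samples:
        theta = rng.beta(1.0, 1.0)          # draw from the prior
        x = rng.binomial(n, theta)          # simulate data
        s = x / n                           # simulated summary statistic
        accept_prob = np.exp(-abs(sdp - s) / scale)  # Laplace kernel (unnormalized)
        if rng.random() < accept_prob:
            samples.append(theta)
    return np.array(samples)

rng = np.random.default_rng(1)
draws = abc_with_privacy_kernel(sdp=0.4, n=50, epsilon=1.0,
                                sensitivity=1 / 50, num_samples=200, rng=rng)
```

Rejection ABC with an exact-match kernel would be inefficient or infeasible here; using the perturbation density as the kernel is what converts the approximation into exactness.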
We introduce a new method for updating subjective beliefs based on Jeffrey's rule of conditioning, called dynamic (precise) probability kinematics (DPK). We also generalize it to work with sets of probabilities, called dynamic imprecise probability kinematics (DIPK). Updating a set of probabilities may be computationally costly. To this end, we provide bounds for the lower probability associated with the updated probability set, characterizing the set completely. The behavior of the updated sets of probabilities is studied, including contraction, dilation, and sure loss. We discuss the application of DPK and DIPK to survey sampling studies in which coarse and imprecise observations are anticipated.
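Jeffrey's rule, the building block of the kinematics above, replaces the marginal over an evidence partition with revised probabilities while holding the conditionals fixed: P'(a, b) = P(a | b) q(b). A minimal sketch for a discrete joint distribution (the example numbers are assumptions for illustration):

```python
import numpy as np

def jeffrey_update(joint, q):
    """Jeffrey's rule of conditioning on a discrete joint distribution.
    `joint` is P(a, b) with rows indexing a and columns indexing the
    evidence partition b; `q` is the revised marginal over the partition.
    Returns P'(a, b) = P(a | b) q(b): conditionals fixed, margin replaced."""
    joint = np.asarray(joint, dtype=float)
    b_marginal = joint.sum(axis=0)          # current P(b)
    conditionals = joint / b_marginal       # columns hold P(a | b)
    return conditionals * np.asarray(q, dtype=float)

P = np.array([[0.2, 0.1],
              [0.3, 0.4]])
q = np.array([0.8, 0.2])  # revised, uncertain evidence on the partition
P_new = jeffrey_update(P, q)
```

Ordinary Bayesian conditioning is the special case where q puts probability one on a single cell of the partition; DIPK applies updates of this form over an entire set of probabilities.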
In a technical treatment, this article establishes the necessity of transparent privacy for drawing unbiased statistical inference for a wide range of scientific questions. Transparency is a distinct feature enjoyed by differential privacy: the probabilistic mechanism with which the data are privatized can be made public without sabotaging the privacy guarantee. Uncertainty due to transparent privacy may be conceived as a dynamic and controllable component from the total survey error perspective. As the 2020 U.S. Decennial Census adopts differential privacy, constraints imposed on the privatized data products through optimization constitute a threat to transparency and result in limited statistical usability. Transparent privacy presents a viable path toward principled inference from privatized data releases, and shows great promise toward improved reproducibility, accountability, and public trust in modern data curation.