2021
DOI: 10.48550/arxiv.2106.07830
Preprint

On the Convergence and Calibration of Deep Learning with Differential Privacy

Abstract: In deep learning with differential privacy (DP), the neural network usually achieves privacy at the cost of slower convergence (and thus lower performance) than its non-private counterpart. This work gives the first convergence analysis of DP deep learning, through the lens of training dynamics and the neural tangent kernel (NTK). Our convergence theory successfully characterizes the effects of two key components in DP training: per-sample clipping (flat or layerwise) and noise addition. …
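The two clipping styles named in the abstract can be made concrete. Below is a minimal, illustrative sketch (not the paper's code) of flat versus layerwise per-sample clipping; the function names, thresholds, and numpy formulation are assumptions introduced here for illustration.

```python
import numpy as np

def flat_clip(per_layer_grads, C):
    # Flat (all-layer) clipping: one L2 norm over the whole per-sample
    # gradient, one threshold C shared by every layer.
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in per_layer_grads))
    scale = min(1.0, C / (global_norm + 1e-12))
    return [g * scale for g in per_layer_grads]

def layerwise_clip(per_layer_grads, thresholds):
    # Layerwise clipping: each layer's per-sample gradient is clipped
    # independently to its own threshold C_l.
    return [g * min(1.0, C_l / (np.linalg.norm(g) + 1e-12))
            for g, C_l in zip(per_layer_grads, thresholds)]

# Example: a toy two-layer "network" with random per-sample gradients.
rng = np.random.default_rng(0)
grads = [rng.normal(size=(4, 3)), rng.normal(size=(3,))]
flat = flat_clip(grads, C=1.0)
layered = layerwise_clip(grads, thresholds=[0.5, 0.5])
```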

Cited by 8 publications (16 citation statements)
References 43 publications
“…The feature clipping technique is different from the DP-SGD [1] algorithm, which only requires per-example gradient clipping, and also from Bu et al [9], which uses global gradient clipping. The major reason we use data feature clipping (in addition to per-example gradient clipping) is to ensure smoothness of the logistic regression loss function (by Proposition 6.1), a necessary condition for applying our privacy bound (Theorem 4.4).…”
Section: Experiments Setting (mentioning)
confidence: 99%
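To make the distinction in this statement concrete, here is a minimal sketch of data feature clipping, as opposed to the per-example gradient clipping shown earlier; the function name, the threshold R, and the toy data are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def clip_features(X, R):
    # Feature clipping: bound each example's feature vector to L2 norm R
    # *before* training. For logistic regression, the smoothness constant
    # of the loss is controlled by the feature norms, which is the
    # motivation the quote gives; gradient clipping is applied separately.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X * np.minimum(1.0, R / (norms + 1e-12))

# Example usage: clip features to norm 1, then train with per-example
# gradient clipping on top.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
X_clipped = clip_features(X, R=1.0)
assert np.all(np.linalg.norm(X_clipped, axis=1) <= 1.0 + 1e-6)
```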
“…For the vast majority of learning tasks (such as private image classification), the composition-based privacy analysis of DP-SGD [1] is the mainstream method, mainly due to its simplicity and broad applicability. However, private learning via DP-SGD usually suffers from slow convergence [9], under which the privacy loss is large because it is composed over a large number of iterations. This, in turn, results in overestimating the magnitude of additive noise needed for differentially private training, and worsens the privacy-accuracy trade-off of the training algorithm.…”
Section: Introduction (mentioning)
confidence: 99%
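As a reference point for the convergence discussion above, this is a minimal sketch of one DP-SGD update (clip each per-example gradient, add Gaussian noise calibrated to the clipping norm, average, step); hyperparameter names are illustrative and the privacy accounting over many such steps is omitted.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr, C, sigma, rng):
    # One DP-SGD update: clip every per-example gradient to L2 norm C,
    # sum, add Gaussian noise with std sigma * C, then average and step.
    # Over T iterations the privacy loss composes, which is why slow
    # convergence (large T) forces either more noise or a larger epsilon.
    total = np.zeros_like(w)
    for g in per_example_grads:
        total += g * min(1.0, C / (np.linalg.norm(g) + 1e-12))
    total += rng.normal(0.0, sigma * C, size=w.shape)
    return w - lr * total / len(per_example_grads)

# Example usage with toy gradients.
rng = np.random.default_rng(0)
w = np.zeros(5)
grads = [rng.normal(size=5) for _ in range(32)]
w_next = dp_sgd_step(w, grads, lr=0.1, C=1.0, sigma=1.0, rng=rng)
```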
“…For future directions, it is of interest to extend the connection between DP-SGD and DP-SGLD to a more general class, DP-SG-MCMC (stochastic gradient Markov chain Monte Carlo), so as to accelerate the convergence of Bayesian gradient methods. In particular, the convergence (especially the rate of convergence), the generalization, and the calibration behaviors of DP-BNNs need more investigation from the theoretical viewpoint, similar to the analysis of DP linear regression [54] and DP deep learning [10].…”
Section: Discussion (mentioning)
confidence: 99%
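To illustrate the DP-SGD/DP-SGLD connection this discussion refers to, here is a hedged sketch of one DP-SGLD update: the same per-example clipping as DP-SGD, with the injected Gaussian noise doubling as the Langevin noise of SGLD. The sqrt(2 * lr) noise scale is the textbook SGLD choice, an assumption made here; the exact calibration and accounting in the cited DP-BNN work may differ.

```python
import numpy as np

def dp_sgld_step(w, per_example_grads, lr, C, rng):
    # One DP-SGLD update (sketch): per-example clipping exactly as in
    # DP-SGD, but the added Gaussian noise N(0, 2*lr) is the Langevin
    # noise of SGLD, so a single noise source drives both approximate
    # posterior sampling and (with suitable accounting) the DP guarantee.
    total = np.zeros_like(w)
    for g in per_example_grads:
        total += g * min(1.0, C / (np.linalg.norm(g) + 1e-12))
    noise = rng.normal(0.0, np.sqrt(2.0 * lr), size=w.shape)
    return w - lr * total / len(per_example_grads) + noise
```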
“…randomization is the primary protection adopted by Differential Privacy and its variants. Despite its simplicity and popularity, this approach inevitably leads to compromised performance in terms of slow convergence, low model performance, and loose privacy guarantees, as documented in [3,28,4] and elsewhere. On the other hand, the privacy protection provided by Homomorphic Encryption (HE) is in principle more secure than DP [14]. However, the computational and communication overhead incurred by HE is orders of magnitude higher than that of the randomization approach adopted in DP.…”
Section: Related Work (mentioning)
confidence: 99%
“…It must be noted that the modeling of these two adversarial tasks in the unified Bayesian Privacy framework is well-justified, since privacy guarantees estimated without considering leakage attacks are too loose to provide an accurate account of information leakage. Indeed, the deficiency of the Differential Privacy framework is one such example, as documented in the literature [28,3,4] and demonstrated by experimental results in the present paper.…”
Section: Bayesian Privacy (mentioning)
confidence: 99%