2021
DOI: 10.48550/arxiv.2106.07830
Preprint

On the Convergence and Calibration of Deep Learning with Differential Privacy

Abstract: In deep learning with differential privacy (DP), the neural network usually achieves privacy at the cost of slower convergence (and thus lower performance) than its non-private counterpart. This work gives the first convergence analysis of DP deep learning, through the lens of training dynamics and the neural tangent kernel (NTK). Our convergence theory successfully characterizes the effects of two key components in DP training: per-sample clipping (flat or layerwise) and noise addition. …
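The two clipping styles named in the abstract can be made concrete. Below is a minimal, illustrative sketch (not the paper's code) of flat versus layerwise per-sample clipping; the function names, thresholds, and numpy formulation are assumptions introduced here for illustration.

```python
import numpy as np

def flat_clip(per_layer_grads, C):
    # Flat (all-layer) clipping: one L2 norm over the whole per-sample
    # gradient, one threshold C shared by every layer.
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in per_layer_grads))
    scale = min(1.0, C / (global_norm + 1e-12))
    return [g * scale for g in per_layer_grads]

def layerwise_clip(per_layer_grads, thresholds):
    # Layerwise clipping: each layer's per-sample gradient is clipped
    # independently to its own threshold C_l.
    return [g * min(1.0, C_l / (np.linalg.norm(g) + 1e-12))
            for g, C_l in zip(per_layer_grads, thresholds)]

# Example: a toy two-layer "network" with random per-sample gradients.
rng = np.random.default_rng(0)
grads = [rng.normal(size=(4, 3)), rng.normal(size=(3,))]
flat = flat_clip(grads, C=1.0)
layered = layerwise_clip(grads, thresholds=[0.5, 0.5])
```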

Cited by 8 publications (16 citation statements)
References 43 publications
“…The feature clipping technique is different from the DP-SGD [1] algorithm, which only requires per-example gradient clipping, and also from Bu et al [9], which uses global gradient clipping. The major reason we use data feature clipping (in addition to per-example gradient clipping) is to ensure smoothness of the logistic regression loss function (by Proposition 6.1), a necessary condition for applying our privacy bound (Theorem 4.4).…”
Section: Experiments Setting (mentioning)
confidence: 99%
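To make the distinction in this statement concrete, here is a minimal sketch of data feature clipping, as opposed to the per-example gradient clipping shown earlier; the function name, the threshold R, and the toy data are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def clip_features(X, R):
    # Feature clipping: bound each example's feature vector to L2 norm R
    # *before* training. For logistic regression, the smoothness constant
    # of the loss is controlled by the feature norms, which is the
    # motivation the quote gives; gradient clipping is applied separately.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X * np.minimum(1.0, R / (norms + 1e-12))

# Example usage: clip features to norm 1, then train with per-example
# gradient clipping on top.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
X_clipped = clip_features(X, R=1.0)
assert np.all(np.linalg.norm(X_clipped, axis=1) <= 1.0 + 1e-6)
```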
“…For the vast majority of learning tasks (such as private image classification), the composition-based privacy analysis of DP-SGD [1] is the mainstream method, mainly due to its simplicity and broad applicability. However, private learning via DP-SGD usually suffers from slow convergence [9], under which the privacy loss is large because it is composed over a large number of iterations. This, in turn, results in overestimating the magnitude of additive noise needed for differentially private training, and worsens the privacy-accuracy trade-off of the training algorithm.…”
Section: Introduction (mentioning)
confidence: 99%
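As a reference point for the convergence discussion above, this is a minimal sketch of one DP-SGD update (clip each per-example gradient, add Gaussian noise calibrated to the clipping norm, average, step); hyperparameter names are illustrative and the privacy accounting over many such steps is omitted.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr, C, sigma, rng):
    # One DP-SGD update: clip every per-example gradient to L2 norm C,
    # sum, add Gaussian noise with std sigma * C, then average and step.
    # Over T iterations the privacy loss composes, which is why slow
    # convergence (large T) forces either more noise or a larger epsilon.
    total = np.zeros_like(w)
    for g in per_example_grads:
        total += g * min(1.0, C / (np.linalg.norm(g) + 1e-12))
    total += rng.normal(0.0, sigma * C, size=w.shape)
    return w - lr * total / len(per_example_grads)

# Example usage with toy gradients.
rng = np.random.default_rng(0)
w = np.zeros(5)
grads = [rng.normal(size=5) for _ in range(32)]
w_next = dp_sgd_step(w, grads, lr=0.1, C=1.0, sigma=1.0, rng=rng)
```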
“…For future directions, it is of interest to extend the connection between DP-SGD and DP-SGLD to a more general class, DP-SG-MCMC (stochastic gradient Markov chain Monte Carlo), so as to accelerate the convergence of Bayesian gradient methods. In particular, the convergence (especially the rate of convergence), the generalization, and the calibration behaviors of DP-BNNs need more investigation from the theoretical viewpoint, similar to the analysis of DP linear regression [54] and DP deep learning [10].…”
Section: Discussion (mentioning)
confidence: 99%
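To illustrate the DP-SGD/DP-SGLD connection this discussion refers to, here is a hedged sketch of one DP-SGLD update: the same per-example clipping as DP-SGD, with the injected Gaussian noise doubling as the Langevin noise of SGLD. The sqrt(2 * lr) noise scale is the textbook SGLD choice, an assumption made here; the exact calibration and accounting in the cited DP-BNN work may differ.

```python
import numpy as np

def dp_sgld_step(w, per_example_grads, lr, C, rng):
    # One DP-SGLD update (sketch): per-example clipping exactly as in
    # DP-SGD, but the added Gaussian noise N(0, 2*lr) is the Langevin
    # noise of SGLD, so a single noise source drives both approximate
    # posterior sampling and (with suitable accounting) the DP guarantee.
    total = np.zeros_like(w)
    for g in per_example_grads:
        total += g * min(1.0, C / (np.linalg.norm(g) + 1e-12))
    noise = rng.normal(0.0, np.sqrt(2.0 * lr), size=w.shape)
    return w - lr * total / len(per_example_grads) + noise
```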
“…randomization is the primary protection adopted by Differential Privacy and its variants. Despite its simplicity and popularity, this approach inevitably leads to compromised performance in terms of slow convergence, low model performance, and loose privacy guarantees, as documented in [3,28,4] and elsewhere. On the other hand, the privacy protection provided by Homomorphic Encryption (HE) is in principle more secure than DP [14]. However, the computational and communication overhead incurred by HE is orders of magnitude higher than that of the randomization approach adopted in DP.…”
Section: Related Work (mentioning)
confidence: 99%
“…It must be noted that the modeling of these two adversarial tasks in the unified Bayesian Privacy framework is well-justified, since privacy guarantees estimated without considering leakage attacks are too loose to provide an accurate account of information leakage. Indeed, the deficiency of the Differential Privacy framework is one such example, as documented in the literature [28,3,4] and demonstrated by experimental results in the present paper.…”
Section: Bayesian Privacy (mentioning)
confidence: 99%