Stochastic gradient descent with differentially private updates

Song, Shuang; Chaudhuri, Kamalika; Sarwate, Anand D.

doi:10.1109/globalsip.2013.6736861

Cited by 448 publications

(388 citation statements)

References 10 publications

Supporting

Mentioning

385

Contrasting

Order By: Relevance

“…The example above is typical of what can be expressed in our language, and many variants of machine learning techniques that rely on gradient descent (e.g., as in [Goodfellow et al 2016], and commonly used in systems like TensorFlow) are in scope as well. For instance, there is no difficulty in expressing optimization with momentum, or differentially private stochastic gradient descent (e.g., [Abadi et al 2016b;Song et al 2013]). Probabilistic choice may be treated via random number generators, as is done in practice.…”

Section: Fig 1 Typing Rulesmentioning

confidence: 99%

A simple differentiable programming language

Abadi

Plotkin

2019

Proc. ACM Program. Lang.

View full text Add to dashboard Cite

Automatic differentiation plays a prominent role in scientific computing and in modern machine learning, often in the context of powerful programming systems. The relation of the various embodiments of automatic differentiation to the mathematical notion of derivative is not always entirely clear-discrepancies can arise, sometimes inadvertently. In order to study automatic differentiation in such programming contexts, we define a small but expressive programming language that includes a construct for reverse-mode differentiation. We give operational and denotational semantics for this language. The operational semantics employs popular implementation techniques, while the denotational semantics employs notions of differentiation familiar from real analysis. We establish that these semantics coincide.

show abstract

Section: Fig 1 Typing Rulesmentioning

confidence: 99%

A simple differentiable programming language

Abadi

Plotkin

2019

Proc. ACM Program. Lang.

View full text Add to dashboard Cite

show abstract

“…Addressing the problems mentioned above, the Stochastic Channel-Based Federated Learning (SCBF) realizes the function of differential privacy preserving by protecting the two sources of potential privacy leakage from federated learning: the actual values of uploaded gradients from the local participants and the mechanism these gradients are chosen ??. By setting a threshold to select the parameters of gradients channel-wise, the actual values uploaded to the server are stored in a sparse tensor that processed from Stochastic Gradient Descent (SGD), a stochastic training process which has already been used for many privacy preserving cases [8], [9]. Besides, the participant could independently choose the update rate for their models, thus making it hard to track the selection of the channels that used for update, especially when they are trained individually using different datasets through stochastic ways.…”

Section: B Differential Privacy Preservingmentioning

confidence: 99%

Privacy Preserving: Stochastic Channel-Based Federated Learning with Neural Network Pruning (Preprint)

Shao¹,

He²,

Liu³

et al. 2019

Preprint

View full text Add to dashboard Cite

BACKGROUND Artificial neural network has achieved unprecedented success in a wide variety of domains such as classifying, predicting and recognizing objects. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns and people want to take control over their sensitive information during both training and using processes. OBJECTIVE To address this problem, we propose a privacy-preserving method for the distributed system. The proposed method, Stochastic Channel-Based Federated Learning (SCBF), enables the participants to train a high-performance model cooperatively without sharing their inputs. METHODS Specifically, we design, implement and evaluate a channel-based update algorithm for the central server in a distributed system. The update algorithm will select the channels with regard to the most active features in a training loop and upload them as learned information from local datasets. A pruning process, which serves as a model accelerator, is applied to the algorithm based on the validation set. RESULTS We construct a distributed system consisting of 5 clients and 1 server. Our trials show that the Stochastic Channel-Based Federated Learning method can achieve an AUCROC of 0.9776 and an AUCPR of 0.9695 with 10% channels shared with the server. Compared with Federated Averaging algorithm, the proposed method achieves 0.05388 higher in AUCROC and 0.09695 higher in AUCPR. In addition, our experiment shows that 57% of the time is saved by the pruning process with only a reduction of 0.0047 in AUCROC performance and a reduction of 0.0068 in AUCPR. CONCLUSIONS In the experiment, our model presents better performances and higher saturating speed than the Federated Averaging method, which reveals all the parameters of local models to the server. We also demonstrate that the saturating rate of performance could be promoted by introducing a pruning process and further improvement could be achieved by tuning the pruning rate.

show abstract

“…Privacy is achieved in part by assuming that devices do not share raw data either with each other or with any external party. In addition, theoretical notions of privacy such as differential privacy can be incorporated in this framework as well [4], [6], [7]. In-place data processing could also provide better scalability for certain tasks in the limit of extremely large systems compared to cloud-based solutions by exploiting local resources and networks, as proposed e.g.…”

Section: Introductionmentioning

confidence: 99%

Dimension Reduction Methods for Collaborative Mobile Gossip Learning

Berta

Hegedűs

Jelasity

2016

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)

View full text Add to dashboard Cite

Decentralized learning algorithms are very sensitive to the size of the raw data records due to the resulting large communication cost. This can, in the worst case, even make decentralized learning infeasible. Dimension reduction is a key technique to compress data and to obtain small models. In this paper, we propose a number of robust and efficient decentralized approaches to dimension reduction in the system model where each network node holds only one data record. These algorithms build on searching for good random projections. We present a thorough experimental comparison of the proposed algorithms and compare them with a variant of distributed singular value decomposition (SVD), a state-of-the-art algorithm for dimension reduction. We base our experiments on a trace of real mobile phone usage. We conclude that our method based on selecting good random projections is preferable and provides good quality results when the output is required on a very short timescale, within tens of minutes. We also present a hybrid method that combines the advantages of random projections and SVD. We demonstrate that the hybrid method offers good performance over all timescales.

show abstract

Stochastic gradient descent with differentially private updates

Cited by 448 publications

References 10 publications

A simple differentiable programming language

A simple differentiable programming language

Privacy Preserving: Stochastic Channel-Based Federated Learning with Neural Network Pruning (Preprint)

Dimension Reduction Methods for Collaborative Mobile Gossip Learning

Contact Info

Product

Resources

About