Differential Privacy (DP) provides a formal privacy guarantee preventing adversaries with access to a machine learning model from extracting information about individual training points. Differentially Private Stochastic Gradient Descent (DP-SGD), the most popular DP training method, realizes this protection by injecting noise during training. However, previous works have found that DP-SGD often leads to a significant degradation in performance on standard image classification benchmarks. Furthermore, some authors have postulated that DP-SGD inherently performs poorly on large models, since the norm of the noise required to preserve privacy is proportional to the model dimension. In contrast, we demonstrate that DP-SGD on over-parameterized models can perform significantly better than previously thought. Combining careful hyper-parameter tuning with simple techniques to ensure signal propagation and improve the convergence rate, we obtain a new SOTA on CIFAR-10 of 81.4% under (8, 10⁻⁵)-DP using a 40-layer Wide-ResNet, improving over the previous SOTA of 71.7%. When fine-tuning a pre-trained 200-layer Normalizer-Free ResNet, we achieve a remarkable 77.1% top-1 accuracy on ImageNet under (1, 8·10⁻⁷)-DP, and 81.1% under (8, 8·10⁻⁷)-DP. This markedly exceeds the previous SOTA of 47.9% under the larger privacy budget of (10, 10⁻⁶)-DP. We believe our results are a significant step towards closing the accuracy gap between private and non-private image classification.
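For readers unfamiliar with the mechanism the abstract refers to, here is a minimal sketch of the standard DP-SGD update (per-example gradient clipping followed by calibrated Gaussian noise). The function name, toy objective, and hyper-parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD update: clip each per-example gradient to L2 norm
    `clip_norm`, average over the batch, add Gaussian noise with standard
    deviation noise_multiplier * clip_norm / batch_size, then take a step."""
    batch_size = len(per_example_grads)
    clipped = [
        g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
        for g in per_example_grads
    ]
    noisy_mean = np.mean(clipped, axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm / batch_size, size=params.shape
    )
    return params - lr * noisy_mean

# Toy usage: privately fit a mean (the gradient of 0.5*(w - x)^2 is w - x).
rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=(256, 1))
w = np.zeros(1)
for _ in range(500):
    grads = [w - x for x in data]
    w = dp_sgd_step(w, grads, clip_norm=1.0, noise_multiplier=1.1, lr=0.1, rng=rng)
print(w)  # approaches the data mean, up to the injected DP noise
```

Note how the noise scale is tied to the clipping norm, which is why the abstract's observation about noise growing with model dimension matters: the noise vector spans every parameter.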
Humans interact with each other on a daily basis by developing and maintaining various social norms, and it is critical to form a deeper understanding of how such norms develop, how they change, and how quickly they change. In this work, we develop an evolutionary game-theoretic model grounded in research in cultural psychology showing that humans in different cultures vary in their tendency to conform with those around them. Using this model, we analyze the evolutionary relationship between the tendency to conform and how quickly a population reacts when conditions make a change in norm desirable. Our analysis identifies conditions under which a tipping point is reached in a population, causing norms to change rapidly.
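The abstract does not spell out the model, so the following is only a toy illustration of the tipping-point phenomenon it describes, under assumed mean-field logit-choice dynamics with a conformity bonus; every name and parameter here is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def norm_dynamics(conformity, steps=400, rate=0.1, beta=8.0):
    """Mean-field toy: s is the share of the population following the new
    norm B. Each step, s relaxes toward a logit best response, where an
    agent's utility is an intrinsic payoff (ramped up over time as
    conditions change) plus a conformity bonus for matching the majority."""
    s, history = 0.01, []
    for t in range(steps):
        gap = 4.0 * t / steps - 1.0            # intrinsic advantage of B: -1 -> +3
        drive = gap + conformity * (2.0 * s - 1.0)
        s += rate * (sigmoid(beta * drive) - s)
        history.append(s)
    return np.array(history)

# Stronger conformity delays the change, and the eventual switch is abrupt:
# once enough agents defect, conformity flips from a brake to an accelerator.
for c in (0.0, 0.5, 2.0):
    h = norm_dynamics(conformity=c)
    tip = int(np.argmax(h > 0.5)) if h.max() > 0.5 else None
    print(f"conformity={c}: majority reached at step {tip}")
```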
In computer vision, it is standard practice to draw a single sample from the data augmentation procedure for each unique image in the mini-batch; however, it is not clear whether this choice is optimal for generalization. In this work, we provide a detailed empirical evaluation of how the number of augmentation samples per unique image influences performance on held-out data. Remarkably, we find that drawing multiple samples per image consistently enhances test accuracy for both small- and large-batch training, despite reducing the number of unique training examples in each mini-batch. This benefit arises even when different augmentation multiplicities perform the same number of parameter updates and gradient evaluations. Our results suggest that, although the variance in the gradient estimate arising from subsampling the dataset has an implicit regularization benefit, the variance which arises from the data augmentation process harms test accuracy. By applying augmentation multiplicity to the recently proposed NFNet model family, we achieve a new ImageNet state of the art of 86.8% top-1 accuracy without extra data.
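To make the idea of augmentation multiplicity concrete, here is a minimal sketch of one way to build such mini-batches; the function names and the toy flip-and-shift augmentation are our illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def augment(img, rng):
    # Toy augmentation: random horizontal flip plus a small random shift.
    out = img[:, ::-1] if rng.random() < 0.5 else img
    return np.roll(out, rng.integers(-2, 3), axis=1)

def build_batch(images, batch_size, multiplicity, rng):
    """Build a mini-batch with augmentation multiplicity K: sample
    batch_size // multiplicity unique images, then draw `multiplicity`
    independent augmented views of each. The total batch size (and hence
    the number of gradient evaluations per step) is unchanged; only the
    number of unique images per batch is reduced."""
    n_unique = batch_size // multiplicity
    idx = rng.choice(len(images), size=n_unique, replace=False)
    views = [augment(images[i], rng) for i in idx for _ in range(multiplicity)]
    return np.stack(views)

rng = np.random.default_rng(0)
images = rng.normal(size=(100, 32, 32))   # 100 toy grayscale images
batch = build_batch(images, batch_size=64, multiplicity=4, rng=rng)
print(batch.shape)                        # (64, 32, 32)
```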
Batch Normalization is a key component in almost all state-of-the-art image classifiers, but it also introduces practical challenges: it breaks the independence between training examples within a batch, can incur compute and memory overhead, and often results in unexpected bugs. Building on recent theoretical analyses of deep ResNets at initialization, we propose a simple set of analysis tools to characterize signal propagation on the forward pass, and leverage these tools to design highly performant ResNets without activation normalization layers. Crucial to our success is an adapted version of the recently proposed Weight Standardization. Our analysis tools show how this technique preserves the signal in networks with ReLU or Swish activation functions by ensuring that the per-channel activation means do not grow with depth. Across a range of FLOP budgets, our networks attain performance competitive with the state-of-the-art EfficientNets on ImageNet.
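As a concrete illustration of the adapted (Scaled) Weight Standardization the abstract credits, here is a minimal numpy sketch; the function name and toy shapes are ours, and the ReLU gain follows the variance-preservation argument described in the abstract.

```python
import numpy as np

def scaled_weight_standardization(w, gain):
    """Re-parameterize a weight matrix of shape (out_channels, fan_in):
    subtract each output unit's mean, divide by its standard deviation
    times sqrt(fan_in), and apply an activation-dependent gain. Each row
    then has zero mean, so per-channel activation means cannot grow with
    depth, and the gain restores unit variance after the nonlinearity."""
    mean = w.mean(axis=1, keepdims=True)
    std = w.std(axis=1, keepdims=True)
    fan_in = w.shape[1]
    return gain * (w - mean) / (std * np.sqrt(fan_in) + 1e-12)

# Sanity check: with the ReLU gain sqrt(2 / (1 - 1/pi)), a ReLU layer with
# standardized weights approximately preserves unit variance.
rng = np.random.default_rng(0)
x = rng.normal(size=(4096, 256))
w = rng.normal(size=(256, 256))
gain = np.sqrt(2.0 / (1.0 - 1.0 / np.pi))
h = np.maximum(x, 0.0) @ scaled_weight_standardization(w, gain).T
print(float(h.var()))  # close to 1.0: the signal is preserved
```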