2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2016.514
DisturbLabel: Regularizing CNN on the Loss Layer

Abstract: For a long time we have combated overfitting in the CNN training process with model regularization, including weight decay, model averaging, data augmentation, etc. In this paper, we present DisturbLabel, an extremely simple algorithm that randomly replaces a fraction of the labels with incorrect values in each iteration. Although it may seem counterintuitive to intentionally generate incorrect training labels, we show that DisturbLabel prevents the network training from overfitting by implicitly averaging over exponen…
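The label-disturbing step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's exact code: the noise rate `alpha` and the uniform draw over all classes are assumptions for the sketch.

```python
import numpy as np

def disturb_labels(labels, num_classes, alpha=0.1, rng=None):
    """Return a copy of `labels` in which each label is, with probability
    `alpha`, replaced by a class index drawn uniformly at random.
    Intended to be called anew at every training iteration, so the noisy
    labels change from iteration to iteration."""
    rng = rng or np.random.default_rng()
    labels = np.asarray(labels).copy()
    # Independently mark each example for disturbance.
    mask = rng.random(labels.shape[0]) < alpha
    # Replace marked labels with uniformly random classes
    # (which may occasionally coincide with the true label).
    labels[mask] = rng.integers(0, num_classes, size=int(mask.sum()))
    return labels
```

Because the corrupted subset is resampled every iteration, no single example is permanently mislabeled; the network effectively averages over many noisy label configurations.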


Cited by 202 publications (159 citation statements)
References 22 publications
“…For the cleanliness test, we replaced the labels with a random incorrect one for 5%, 10%, 15%, and 32% of the examples. The labels are fixed, unlike the recent work on disturbing labels as a regularization method [53].…”
Section: Methods (mentioning, confidence: 99%)
“…In Figure 1. There have been numerous studies to solve either of the two issues individually. On the one hand, to reduce the risk of overfitting in deep CNNs, previous research suggests adding appropriate randomness into the training phase [39, 42, 45]. For example, Dropout [39] adds randomness in activation by randomly discarding the hidden layers' outputs.…”
Section: Introduction (mentioning, confidence: 99%)
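The Dropout mechanism mentioned in the excerpt above, randomly discarding hidden-layer outputs during training, can be illustrated with a generic inverted-dropout sketch. This is an assumption-laden illustration, not the exact code of [39]; the keep-probability handling and rescaling convention are the common modern formulation.

```python
import numpy as np

def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout: during training, zero each hidden unit with
    probability `p` and rescale the survivors by 1/(1-p), so the
    expected activation is unchanged and no scaling is needed at test
    time. A generic sketch, not the implementation from [39]."""
    if not training or p == 0.0:
        return activations
    rng = rng or np.random.default_rng()
    # Bernoulli keep-mask over the activation tensor.
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)
```

Note the conceptual contrast with DisturbLabel: Dropout injects randomness into hidden activations, whereas DisturbLabel injects it into the loss layer through the labels themselves.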
“…For these datasets, we train six networks: (a) a light ConvNet with the same architecture as in [42], (b) the network-in-network (NIN) [43], (c)…”
Section: CIFAR-10/100 and SVHN (mentioning, confidence: 99%)
“…It can be seen that our method outperforms DisturbLabel [42] and L-Softmax [15] under the same architectures. Again, EM-softmax [16] achieves a lower error rate (26.86%) than ours (25.91%) using model ensembling, while we only measure single-model performance.…”
(Mentioning, confidence: 95%)