Large datasets often have unreliable labels, such as those obtained from Amazon's Mechanical Turk or social media platforms, and classifiers trained on mislabeled datasets often exhibit poor performance. We present a simple, effective technique for accounting for label noise when training deep neural networks. We augment a standard deep network with a softmax layer that models the label noise statistics. Then, we train the deep network and noise model jointly via end-to-end stochastic gradient descent on the (perhaps mislabeled) dataset. The augmented model is overdetermined, so to encourage the learning of a non-trivial noise model, we apply dropout regularization to the weights of the noise model during training. Numerical experiments on noisy versions of the CIFAR-10 and MNIST datasets show that the proposed dropout technique outperforms state-of-the-art methods.

The previous decade has witnessed swift advances in the performance of deep neural networks for supervised image classification and recognition. State-of-the-art performance requires large datasets, such as the 10,000,000 hand-labeled images comprising the ImageNet dataset [1], [2]. Large datasets suffer from noise, not only in the images themselves but also in their associated labels. Researchers often resort to non-expert sources such as Amazon's Mechanical Turk or tags from social networking sites to label massive datasets, resulting in unreliable labels. Furthermore, the distinction between class labels is not always precise, and even experts may disagree on the correct label of an image. Regardless of its source, the resulting noise can drastically degrade learning performance [3], [4].

Learning with noisy labels has been studied previously, but not extensively. Techniques for training support vector machines, K-nearest neighbor classifiers, and logistic regression models with label noise are presented in [5], [6]. Further, [6] gives sample complexity bounds in the presence of label noise. Only a few papers consider deep learning with noisy labels. An early work is [7], which studied symmetric label noise in neural networks. Binary classification with label noise was studied in [8]. In [9], techniques for multi-class learning and general label noise models are presented. This approach adds an extra linear layer, intended to model the label noise, to the conventional convolutional neural network (CNN) architecture. In a similar vein, the work of [10] uses self-learning techniques to "bootstrap" the simultaneous learning of a deep network and a label noise model.

In this paper, we present a simple, effective approach to learning deep neural networks from datasets corrupted by label flips. We augment an arbitrary deep architecture with a softmax layer that characterizes the pairwise label flip probabilities. We jointly learn the parameters of the deep network and the noise model using standard stochastic gradient descent. To ensure that the network learns an accurate noise model, instead of fitting the deep network to the noisy labels...
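To make the described architecture concrete, the following is a minimal PyTorch sketch (not the authors' code) of a base classifier augmented with a softmax noise layer whose weight matrix captures pairwise label-flip probabilities, with dropout applied to that weight matrix during training. The class name, initialization, and hyperparameters are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLabelModel(nn.Module):
    """Wraps any deep network with a softmax noise model over its class posteriors."""
    def __init__(self, base_net: nn.Module, num_classes: int, drop_p: float = 0.5):
        super().__init__()
        self.base_net = base_net                                   # any network producing class logits
        self.noise_weight = nn.Parameter(torch.eye(num_classes))   # assumed near-identity initialization
        self.drop_p = drop_p

    def forward(self, x):
        clean_probs = F.softmax(self.base_net(x), dim=1)           # p(true label | x)
        # Dropout on the noise model's weights, as described above.
        w = F.dropout(self.noise_weight, p=self.drop_p, training=self.training)
        noise_matrix = F.softmax(w, dim=1)                         # row i: p(noisy label | true label i)
        return clean_probs @ noise_matrix                          # p(noisy label | x)

# Training uses the (possibly mislabeled) targets end-to-end, e.g.:
#   loss = F.nll_loss(torch.log(model(x) + 1e-8), noisy_y)
# At test time, predictions come from base_net alone, i.e. the clean-label posteriors.
```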
We present fundamental limits on the reliable classification of linear and affine subspaces from noisy, linear features. Drawing an analogy between discrimination among subspaces and communication over vector wireless channels, we propose two Shannon-inspired measures to characterize asymptotic classifier performance. First, we define the classification capacity, which characterizes necessary and sufficient conditions for the misclassification probability to vanish as the signal dimension, the number of features, and the number of subspaces to be discerned all approach infinity. Second, we define the diversity-discrimination tradeoff which, by analogy with the diversity-multiplexing tradeoff of fading vector channels, characterizes relationships between the number of discernible subspaces and the misclassification probability as the noise power approaches zero. We derive upper and lower bounds on these measures which are tight in many regimes. Numerical results, including a face recognition application, validate the results in practice.
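For context, the analogy invoked here is with the classical diversity-multiplexing tradeoff of fading channels (Zheng and Tse); the paper's own diversity-discrimination definition is not reproduced above, so only the standard fading-channel statement is sketched. With signal-to-noise ratio $\rho$ and rate scaled as $R = r\log\rho$, the diversity order achieved at multiplexing gain $r$ is

```latex
d(r) \;=\; -\lim_{\rho \to \infty} \frac{\log P_e\!\left(r\log\rho\right)}{\log\rho},
```

i.e., the exponential rate at which the error probability decays as the noise power vanishes; the diversity-discrimination tradeoff plays the analogous role with the number of discernible subspaces in place of the data rate.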
In this paper, we develop a reinforcement learning (RL) based system to learn an effective policy for carpooling that maximizes transportation efficiency, so that fewer cars are required to fulfill a given amount of trip demand. To this end, we first develop a deep neural network model, called ST-NN (Spatio-Temporal Neural Network), to predict taxi trip time from raw GPS trip data. Second, we develop a carpooling simulation environment for RL training, using the output of ST-NN and the NYC taxi trip dataset. To maximize transportation efficiency and minimize traffic congestion, we choose the effective distance covered by the driver on a carpool trip as the reward: the more effective distance a driver covers over a trip (i.e., the more trip demand satisfied), the higher the efficiency and the lower the traffic congestion. We compare the performance of the RL-learned policy against a fixed baseline policy that always accepts carpool requests, and obtain promising, interpretable results that demonstrate the advantage of our RL approach. We also compare ST-NN with state-of-the-art travel time estimation methods and observe that ST-NN significantly improves prediction performance and is more robust to outliers.
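As an illustrative sketch only (the authors' ST-NN architecture is not detailed above), a trip-time predictor of this kind could be a small feed-forward network mapping raw pickup/drop-off GPS coordinates plus a time-of-day feature to a predicted trip duration. The feature set, layer sizes, and the name TripTimeMLP are assumptions.

```python
import torch
import torch.nn as nn

class TripTimeMLP(nn.Module):
    """Hypothetical trip-time regressor from raw GPS features (not the authors' ST-NN)."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        # Inputs: pickup (lat, lon), drop-off (lat, lon), time of day -> 5 features.
        self.net = nn.Sequential(
            nn.Linear(5, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),          # predicted trip time (e.g., minutes)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features).squeeze(-1)

# In the RL simulator, such a predictor would supply trip-time estimates, and the
# per-trip reward would be the "effective distance" served by the driver; one
# possible interpretation is the total passenger trip distance fulfilled per trip.
```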