Yu Sun scite author profile

Very deep convolutional networks with hundreds of layers have led to significant reductions in error on competitive benchmarks. Although the unmatched expressiveness of the many layers can be highly desirable at test time, training very deep networks comes with its own set of challenges. The gradients can vanish, the forward flow often diminishes, and the training time can be painfully slow. To address these problems, we propose stochastic depth, a training procedure that enables the seemingly contradictory setup to train short networks and use deep networks at test time. We start with very deep networks but during training, for each mini-batch, randomly drop a subset of layers and bypass them with the identity function. This simple approach complements the recent success of residual networks. It reduces training time substantially and improves the test error significantly on almost all data sets that we used for evaluation. With stochastic depth we can increase the depth of residual networks even beyond 1200 layers and still yield meaningful improvements in test error (4.91% on CIFAR-10).

show abstract

On Calibration of Modern Neural Networks

Guo

Pleiss

Sun

et al. 2017

Preprint

229

301

View full text Add to dashboard Cite

Confidence calibration -the problem of predicting probability estimates representative of the true correctness likelihood -is important for classification models in many applications. We discover that modern neural networks, unlike those from a decade ago, are poorly calibrated. Through extensive experiments, we observe that depth, width, weight decay, and Batch Normalization are important factors influencing calibration. We evaluate the performance of various post-processing calibration methods on state-ofthe-art architectures with image and document classification datasets. Our analysis and experiments not only offer insights into neural network learning, but also provide a simple and straightforward recipe for practical settings: on most datasets, temperature scaling -a singleparameter variant of Platt Scaling -is surprisingly effective at calibrating predictions.

show abstract

Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification

Chen

Sun

Athiwaratkun

et al. 2018

TACL

256

215

View full text Add to dashboard Cite

In recent years great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN 1 ) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exists. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor to learn hidden representations that are simultaneously indicative for the classification task and invariant across languages. Experiments on Chinese and Arabic sentiment classification demonstrate that ADAN significantly outperforms state-of-the-art systems.

show abstract

Automatic Image-Based Plant Disease Severity Estimation Using Deep Learning

Wang

Sun

Wang

2017

Computational Intelligence and Neuroscience

521

212

View full text Add to dashboard Cite

Automatic and accurate estimation of disease severity is essential for food security, disease management, and yield loss prediction. Deep learning, the latest breakthrough in computer vision, is promising for fine-grained disease severity classification, as the method avoids the labor-intensive feature engineering and threshold-based segmentation. Using the apple black rot images in the PlantVillage dataset, which are further annotated by botanists with four severity stages as ground truth, a series of deep convolutional neural networks are trained to diagnose the severity of the disease. The performances of shallow networks trained from scratch and deep models fine-tuned by transfer learning are evaluated systemically in this paper. The best model is the deep VGG16 model trained with transfer learning, which yields an overall accuracy of 90.4% on the hold-out test set. The proposed deep learning model may have great potential in disease control for modern agriculture.

show abstract

An Online Plug-and-Play Algorithm for Regularized Image Reconstruction

Sun

Wohlberg

Kamilov

2019

IEEE Trans. Comput. Imaging

184

159

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Yu Sun

Deep Networks with Stochastic Depth

On Calibration of Modern Neural Networks

Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification

Automatic Image-Based Plant Disease Severity Estimation Using Deep Learning

An Online Plug-and-Play Algorithm for Regularized Image Reconstruction

Contact Info

Product

Resources

About