Confidence calibration, the problem of predicting probability estimates representative of the true correctness likelihood, is important for classification models in many applications. We discover that modern neural networks, unlike those from a decade ago, are poorly calibrated. Through extensive experiments, we observe that depth, width, weight decay, and Batch Normalization are important factors influencing calibration. We evaluate the performance of various post-processing calibration methods on state-of-the-art architectures with image and document classification datasets. Our analysis and experiments not only offer insights into neural network learning, but also provide a simple and straightforward recipe for practical settings: on most datasets, temperature scaling, a single-parameter variant of Platt Scaling, is surprisingly effective at calibrating predictions.
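As a concrete illustration, below is a minimal sketch of temperature scaling in PyTorch, assuming a model that outputs logits and a held-out validation set; the LBFGS optimizer and variable names are illustrative choices, not prescribed by the paper. Because dividing logits by a single positive scalar T does not change the argmax, calibration improves without affecting accuracy.

```python
import torch
import torch.nn as nn

def fit_temperature(logits, labels, max_iter=50):
    """Fit a single temperature T on held-out validation logits by
    minimizing NLL; softmax(logits / T) then gives calibrated probabilities."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=max_iter)
    nll = nn.CrossEntropyLoss()

    def closure():
        optimizer.zero_grad()
        loss = nll(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()

# Usage sketch:
#   T = fit_temperature(val_logits, val_labels)
#   calibrated = torch.softmax(test_logits / T, dim=1)
```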
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections (one between each layer and its subsequent layer), our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, encourage feature reuse, and substantially improve parameter efficiency. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, while requiring fewer parameters and less computation to achieve high performance.
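To make the connectivity pattern concrete, here is a minimal PyTorch sketch of one dense block, assuming 3x3 convolutions with the pre-activation BN-ReLU-Conv ordering; the 1x1 bottleneck and transition layers are omitted for brevity.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Sketch of a dense block: layer i receives the concatenation of the
    block input and the feature maps of all preceding layers."""
    def __init__(self, in_channels, growth_rate, num_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            channels = in_channels + i * growth_rate  # grows with each layer
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)
```

Note how layer i consumes in_channels + i * growth_rate channels: every earlier output feeds every later layer, which is what yields the L(L+1)/2 direct connections.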
We propose Deep Feature Interpolation (DFI), a new data-driven baseline for automatic high-resolution image transformation. As the name suggests, DFI relies only on simple linear interpolation of deep convolutional features from pre-trained convnets. We show that despite its simplicity, DFI can perform high-level semantic transformations like "make older/younger", "make bespectacled", "add smile", among others, surprisingly well, sometimes even matching or outperforming the state-of-the-art. This is particularly unexpected as DFI requires no specialized network architecture or even any deep network to be trained for these tasks. DFI therefore can be used as a new baseline to evaluate more complex algorithms and provides a practical answer to the question of which image transformation tasks are still challenging after the advent of deep learning.
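A rough sketch of the DFI recipe, assuming a feature extractor phi (e.g., flattened activations from a pre-trained convnet): the attribute direction is the difference of mean deep features between images with and without the attribute, and the edited image is recovered by matching the shifted features. The plain pixel-space reconstruction below is a simplified stand-in for the paper's regularized inversion, and the step count, learning rate, and alpha are illustrative.

```python
import torch

def attribute_vector(phi, sources, targets):
    """DFI-style attribute direction: difference of mean deep features
    between target images (with the attribute) and source images (without).
    `phi` maps an image batch to flattened deep features; the feature-layer
    choice is an assumption here."""
    with torch.no_grad():
        w = phi(targets).mean(dim=0) - phi(sources).mean(dim=0)
    return w / w.norm()

def interpolate_and_invert(phi, x, w, alpha=4.0, steps=200, lr=0.05):
    """Shift x's deep features along w, then recover an image whose
    features match the shifted target by gradient descent on pixels."""
    with torch.no_grad():
        target = phi(x) + alpha * w
    z = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = (phi(z) - target).pow(2).sum()
        loss.backward()
        opt.step()
    return z.detach()
```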
The DenseNet architecture [9] is highly computationally efficient as a result of feature reuse. However, a naïve DenseNet implementation can require a significant amount of GPU memory: if not properly managed, pre-activation batch normalization [7] and contiguous convolution operations can produce feature maps that grow quadratically with network depth. In this technical report, we introduce strategies to reduce the memory consumption of DenseNets during training. By strategically using shared memory allocations, we reduce the memory cost for storing feature maps from quadratic to linear. Without the GPU memory bottleneck, it is now possible to train extremely deep DenseNets. Networks with 14M parameters can be trained on a single GPU, up from 4M. A 264-layer DenseNet (73M parameters), which previously would have been infeasible to train, can now be trained on a single workstation with 8 NVIDIA Tesla M40 GPUs. On the ImageNet ILSVRC classification dataset, this large DenseNet obtains a state-of-the-art single-crop top-1 error of 20.26%.
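One way to realize the shared-memory strategy in PyTorch is gradient checkpointing: the concatenation and batch-normalization outputs, which dominate the quadratic memory growth, are cheap to recompute, so they can be discarded after the forward pass and regenerated during backward. The sketch below uses torch.utils.checkpoint as a stand-in for the report's custom shared allocations (an implementation assumption), and the layer shapes are illustrative.

```python
import torch
import torch.utils.checkpoint as cp

class MemoryEfficientDenseLayer(torch.nn.Module):
    """Sketch of the shared-memory idea: concat + BN outputs are never
    stored during the forward pass; they are recomputed on backward."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.norm = torch.nn.BatchNorm2d(in_channels)
        self.relu = torch.nn.ReLU(inplace=True)
        self.conv = torch.nn.Conv2d(in_channels, growth_rate,
                                    kernel_size=3, padding=1)

    def bottleneck(self, *features):
        # This intermediate tensor is the quadratic-memory culprit;
        # checkpointing recomputes it instead of caching it.
        return self.relu(self.norm(torch.cat(features, dim=1)))

    def forward(self, features):
        # `features` is the list of all preceding layers' outputs.
        out = cp.checkpoint(self.bottleneck, *features, use_reentrant=False)
        return self.conv(out)
```

Only the convolution outputs (linear in depth) remain cached, which is what drops the feature-map storage cost from quadratic to linear.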
Current generation general circulation models (GCMs) simulate synoptic-scale climate state variables such as geopotential heights, specific humidity, and integrated vapor transport (IVT) more reliably than mesoscale precipitation. Statistical downscaling methods that condition precipitation on GCM-based, synoptic-scale climate features have shown promise in the reproduction of local precipitation. However, current approaches to climate-state-informed downscaling impose some limitations on the skill of precipitation reproduction, including hard clustering of climate modes into a discrete set of states, utilization of numerical clustering methodologies poorly suited to non-normal data, and a tendency to focus on relationships to a limited set of large-scale climate modes. This study presents a methodology based on emerging machine learning techniques to develop global approximators of regional precipitation and discharge extremes given a suite of synoptic-scale climate state variables. Archetypal analysis is first used to define regional modes of winter and summer extreme precipitation and discharge across the eastern contiguous United States. A 2D convolutional neural network (NN) is then used to predict the co-occurrence of the archetypes using 300- and 700-hPa geopotential heights, 300- and 700-hPa specific humidity, and IVT. Results suggest that 300-hPa geopotential height, 700-hPa specific humidity, and IVT yield the most reliable predictions, although with some important differences by season and region. Finally, we demonstrate that the trained activations of NN convolutional layers can be used to infer the causal pathways between synoptic-scale climate features and regional extremes.
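An illustrative PyTorch sketch of such a network, assuming the five predictor fields are stacked as input channels and archetype co-occurrence is posed as multi-label classification; the layer sizes, pooling, and sigmoid head are assumptions rather than the study's exact configuration.

```python
import torch
import torch.nn as nn

class ArchetypeCNN(nn.Module):
    """Maps stacked synoptic fields (300/700-hPa geopotential height,
    300/700-hPa specific humidity, IVT = 5 channels) to per-archetype
    co-occurrence probabilities."""
    def __init__(self, num_archetypes, in_channels=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool over the lat/lon grid
        )
        self.head = nn.Linear(32, num_archetypes)

    def forward(self, x):
        # x: (batch, 5, lat, lon) gridded climate fields
        h = self.features(x).flatten(1)
        return torch.sigmoid(self.head(h))  # multi-label co-occurrence
```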