Data augmentation is a highly effective approach for improving performance in deep neural networks. The standard view is that it creates an enlarged dataset by adding synthetic data, which raises a problem when combining it with Bayesian inference: how much data are we really conditioning on? This question is particularly relevant to recent observations linking data augmentation to the cold posterior effect. We investigate various principled ways of finding a log-likelihood for augmented datasets. Our approach prescribes augmenting the same underlying image multiple times, at both train and test time, and averaging either the logits or the predictive probabilities. Empirically, we observe the best performance when averaging probabilities. While there are interactions with the cold posterior effect, neither averaging logits nor averaging probabilities eliminates it.

Introduction

Data augmentation (Shorten & Khoshgoftaar, 2019) is a fundamental technique for obtaining high performance in modern neural networks (NNs). In computer vision, data augmentation involves creating synthetic training examples by making small modifications, such as a rotation or crop, to the input image.

At the same time, Bayesian inference allows us to reason about uncertainty in neural network weights (MacKay, 1992; Welling & Teh, 2011; Blundell et al., 2015; Fortuin, 2021) given limited data. Bayesian inference is particularly important in safety-critical settings such as self-driving cars or medical imaging, where it is crucial to be able to hand over to a human when uncertainty is too large.

* equal contribution † equal contribution

Preprint. Under review.
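
The two averaging schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the array layout (one row of class logits per augmentation of the same image), and the random logits standing in for network outputs are all assumptions made for illustration.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def loglik_average_logits(logits, label):
    """Log-likelihood when the K sets of logits are averaged first.

    logits: array of shape (K, C), one row per augmentation of one image
            (hypothetical layout chosen for this sketch).
    """
    mean_logits = logits.mean(axis=0)
    return log_softmax(mean_logits)[label]


def loglik_average_probs(logits, label):
    """Log-likelihood when the K predictive probabilities are averaged."""
    log_probs = log_softmax(logits)[:, label]  # shape (K,)
    # log(mean(exp(log_probs))), computed stably by factoring out the max.
    m = log_probs.max()
    return m + np.log(np.exp(log_probs - m).mean())


# Stand-in for network outputs: K augmentations, C classes.
rng = np.random.default_rng(0)
K, C = 8, 10
logits = rng.normal(size=(K, C))
label = 3

ll_logits = loglik_average_logits(logits, label)
ll_probs = loglik_average_probs(logits, label)
print("average logits:", ll_logits)
print("average probabilities:", ll_probs)
```

With a single augmentation (K = 1) the two quantities coincide. For K > 1 they generally differ, and by Jensen's inequality each is at least as large as the plain average of the K per-augmentation log-likelihoods.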