Sterile neutrinos

The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to improve the learning algorithm itself, given the experience of multiple learning episodes. This paradigm provides an opportunity to tackle many conventional challenges of deep learning, including data and computation bottlenecks, as well as generalization. This survey describes the contemporary meta-learning landscape. We first discuss definitions of meta-learning and position it with respect to related fields, such as transfer learning and hyperparameter optimization. We then propose a new taxonomy that provides a more comprehensive breakdown of the space of meta-learning methods today. We survey promising applications and successes of meta-learning such as few-shot learning and reinforcement learning. Finally, we discuss outstanding challenges and promising areas for future research.

show abstract

Meta-Learning in Neural Networks: A Survey

Hospedales¹,

Antoniou²,

Micaelli³

et al. 2020

Preprint

209

276

View full text Add to dashboard Cite

Accurate prediction of X-ray pulse properties from a free-electron laser using machine learning

et al. 2017

View full text Add to dashboard Cite

Free-electron lasers providing ultra-short high-brightness pulses of X-ray radiation have great potential for a wide impact on science, and are a critical element for unravelling the structural dynamics of matter. To fully harness this potential, we must accurately know the X-ray properties: intensity, spectrum and temporal profile. Owing to the inherent fluctuations in free-electron lasers, this mandates a full characterization of the properties for each and every pulse. While diagnostics of these properties exist, they are often invasive and many cannot operate at a high-repetition rate. Here, we present a technique for circumventing this limitation. Employing a machine learning strategy, we can accurately predict X-ray properties for every shot using only parameters that are easily recorded at high-repetition rate, by training a model on a small set of fully diagnosed pulses. This opens the door to fully realizing the promise of next-generation high-repetition rate X-ray lasers.

show abstract

Zero-shot Knowledge Transfer via Adversarial Belief Matching

Micaelli¹,

Storkey²

2019

Preprint

View full text Add to dashboard Cite

Performing knowledge transfer from a large teacher network to a smaller student is a popular task in modern deep learning applications. However, due to growing dataset sizes and stricter privacy regulations, it is increasingly common not to have access to the data that was used to train the teacher. We propose a novel method which trains a student to match the predictions of its teacher without using any data or metadata. We achieve this by training an adversarial generator to search for images on which the student poorly matches the teacher, and then using them to train the student. Our resulting student closely approximates its teacher for simple datasets like SVHN, and on CIFAR10 we improve on the stateof-the-art for few-shot distillation (with 100 images per class), despite using no data. Finally, we also propose a metric to quantify the degree of belief matching between teacher and student in the vicinity of decision boundaries, and observe a significantly higher match between our zero-shot student and the teacher, than between a student distilled with real data and the teacher. Code is available at: https://github.com/polo5/ZeroShotKnowledgeTransfer

show abstract

Gradient-based Hyperparameter Optimization Over Long Horizons

Micaelli¹,

Storkey²

2020

Preprint

View full text Add to dashboard Cite

Gradient-based hyperparameter optimization is an attractive way to perform metalearning across a distribution of tasks, or improve the performance of an optimizer on a single task. However, this approach has been unpopular for tasks requiring long horizons (many gradient steps), due to memory scaling and gradient degradation issues. A common workaround is to learn hyperparameters online or split the horizon into smaller chunks. However, this introduces greediness which comes with a large performance drop, since the best local hyperparameters can make for poor global solutions. In this work, we enable non-greediness over long horizons with a two-fold solution. First, we share hyperparameters that are contiguous in time, and show that this drastically mitigates gradient degradation issues. Then, we derive a forward-mode differentiation algorithm for the popular momentumbased SGD optimizer, which allows for a memory cost that is constant with horizon size. When put together, these solutions allow us to learn hyperparameters without any prior knowledge. Compared to the baseline of hand-tuned off-theshelf hyperparameters, our method compares favorably on simple datasets like SVHN. On CIFAR-10 we match the baseline performance, and demonstrate for the first time that learning rate, momentum and weight decay schedules can be learned with gradients on a dataset of this size. Code is available at: https: //github.com/polo5/NonGreedyGradientHPO Preprint. Under review.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Paul Micaelli

Meta-Learning in Neural Networks: A Survey

Meta-Learning in Neural Networks: A Survey

Accurate prediction of X-ray pulse properties from a free-electron laser using machine learning

Zero-shot Knowledge Transfer via Adversarial Belief Matching

Gradient-based Hyperparameter Optimization Over Long Horizons

Contact Info

Product

Resources

About