Christopher J. Shallue scite author profile

NASA's Kepler Space Telescope was designed to determine the frequency of Earth-sized planets orbiting Sun-like stars, but these planets are on the very edge of the mission's detection sensitivity. Accurately determining the occurrence rate of these planets will require automatically and accurately assessing the likelihood that individual candidates are indeed planets, even at low signal-to-noise ratios. We present a method for classifying potential planet signals using deep learning, a class of machine learning algorithms that have recently become state-of-the-art in a wide variety of tasks. We train a deep convolutional neural network to predict whether a given signal is a transiting exoplanet or a false positive caused by astrophysical or instrumental phenomena. Our model is highly effective at ranking individual candidates by the likelihood that they are indeed planets: 98.8% of the time it ranks plausible planet signals higher than false-positive signals in our test set. We apply our model to a new set of candidate signals that we identified in a search of known Kepler multi-planet systems. We statistically validate two new planets that are identified with high confidence by our model. One of these planets is part of a five-planet resonant chain around Kepler-80, with an orbital period closely matching the prediction by three-body Laplace relations. The other planet orbits Kepler-90, a star that was previously known to host seven transiting planets. Our discovery of an eighth planet brings Kepler-90 into a tie with our Sun as the star known to host the most planets.
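
The ranking statistic quoted in this abstract ("98.8% of the time it ranks plausible planet signals higher than false-positive signals") can be read as a pairwise comparison over the test set. The sketch below is one plausible way to compute such a rate and is not taken from the paper: the function name `pairwise_ranking_rate`, the tie handling (ties count as half), and the example scores are all assumptions made for illustration.

```python
from itertools import product


def pairwise_ranking_rate(planet_scores, false_positive_scores):
    """Fraction of (planet, false positive) pairs in which the planet
    receives the higher score. Ties count as half a win (an assumption)."""
    pairs = list(product(planet_scores, false_positive_scores))
    if not pairs:
        raise ValueError("need at least one score in each class")
    wins = sum(1.0 if p > f else 0.5 if p == f else 0.0 for p, f in pairs)
    return wins / len(pairs)


# Hypothetical classifier outputs, not data from the paper.
planets = [0.9, 0.8, 0.4]
false_positives = [0.5, 0.2]
rate = pairwise_ranking_rate(planets, false_positives)
print(f"planet ranked above false positive in {rate:.1%} of pairs")
```

Computed this way, the rate equals the area under the ROC curve, which is why it depends only on the ordering of the scores and not on any particular classification threshold.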

Identifying Exoplanets with Deep Learning. III. Automated Triage and Vetting of TESS Candidates

Vanderburg, Huang et al. 2019

NASA's Transiting Exoplanet Survey Satellite (TESS) presents us with an unprecedented volume of space-based photometric observations that must be analyzed in an efficient and unbiased manner. With at least ∼1,000,000 new light curves generated every month from full-frame images alone, automated planet candidate identification has become an attractive alternative to human vetting. Here we present a deep learning model capable of performing triage and vetting on TESS candidates. Our model is modified from an existing neural network designed to automatically classify Kepler candidates, and is the first neural network to be trained and tested on real TESS data. In triage mode, our model can distinguish transit-like signals (planet candidates and eclipsing binaries) from stellar variability and instrumental noise with an average precision (the weighted mean of precisions over all classification thresholds) of 97.0% and an accuracy of 97.4%. In vetting mode, the model is trained to identify only planet candidates with the help of newly added scientific domain knowledge, and achieves an average precision of 69.3% and an accuracy of 97.8%. We apply our model on new data from Sector 6, and present 288 new signals that received the highest scores in triage and vetting and were also identified as planet candidates by human vetters. We also provide a homogeneously classified set of TESS candidates suitable for future training.
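
The abstract defines average precision as "the weighted mean of precisions over all classification thresholds" without stating the weights. A minimal sketch follows, assuming the common convention in which each threshold's precision is weighted by the increase in recall at that threshold. The function name `average_precision`, the handling of tied scores (each item is treated as its own threshold), and the example data are assumptions, not details from the paper.

```python
def average_precision(labels, scores):
    """Average precision with recall-increment weights (assumed convention).

    labels: 1 for a positive example, 0 for a negative one.
    scores: classifier outputs, higher meaning more likely positive.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Walk down the ranking from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_positives = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_here = true_positives / rank
            # Recall rises by 1 / n_pos each time a positive is recovered.
            ap += precision_here / n_pos
    return ap


# Hypothetical labels and scores, not data from the paper.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(f"average precision: {ap:.3f}")
```

Because only the thresholds at which a positive is recovered contribute, this metric can sit far below accuracy when positives are rare, which is consistent with the gap between the two figures reported for vetting mode.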

Embedding Text in Hyperbolic Spaces

Dhingra, Shallue, Norouzi et al. 2018

Natural language text exhibits hierarchical structure in a variety of respects. Ideally, we could incorporate our prior knowledge of this hierarchical structure into unsupervised learning algorithms that work on text data. Recent work by Nickel and Kiela (2017) proposed using hyperbolic instead of Euclidean embedding spaces to represent hierarchical data and demonstrated encouraging results when embedding graphs. In this work, we extend their method with a re-parameterization technique that allows us to learn hyperbolic embeddings of arbitrarily parameterized objects. We apply this framework to learn word and sentence embeddings in hyperbolic space in an unsupervised manner from text corpora. The resulting embeddings seem to encode certain intuitive notions of hierarchy, such as word-context frequency and phrase constituency. However, the implicit continuous hierarchy in the learned hyperbolic space makes interrogating the model's learned hierarchies more difficult than for models that learn explicit edges between items. The learned hyperbolic embeddings show improvements over Euclidean embeddings in some, but not all, downstream tasks, suggesting that hierarchical organization is more useful for some tasks than others.
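
For readers unfamiliar with hyperbolic embedding spaces, the sketch below computes the distance between two points in the Poincaré ball model, the model used by Nickel and Kiela (2017). It illustrates the geometry only: the abstract does not say which formulation or re-parameterization this paper uses, and the function name `poincare_distance` and the example points are assumptions.

```python
import math


def poincare_distance(u, v):
    """Distance between two points inside the unit ball under the
    Poincare ball model:
    arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))."""
    def squared_norm(x):
        return sum(c * c for c in x)

    norm_u, norm_v = squared_norm(u), squared_norm(v)
    if norm_u >= 1.0 or norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    diff = squared_norm([a - b for a, b in zip(u, v)])
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - norm_u) * (1.0 - norm_v)))


# Hypothetical 2-D points, chosen only to show the geometry.
origin = [0.0, 0.0]
near_center = [0.5, 0.0]
near_edge = [0.95, 0.0]
d_center = poincare_distance(origin, near_center)
d_edge = poincare_distance(origin, near_edge)
print(f"origin to 0.50: {d_center:.3f}")
print(f"origin to 0.95: {d_edge:.3f}")
```

Distances grow without bound as points approach the boundary of the ball, so the space has room for tree-like structures whose number of nodes grows exponentially with depth.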

Measuring the Effects of Data Parallelism on Neural Network Training

Shallue, Lee, Antognini et al. 2018

Preprint

Recent hardware developments have dramatically increased the scale of data parallelism available for neural network training. Among the simplest ways to harness next-generation hardware is to increase the batch size in standard mini-batch neural network training algorithms. In this work, we aim to experimentally characterize the effects of increasing the batch size on training time, as measured by the number of steps necessary to reach a goal out-of-sample error. We study how this relationship varies with the training algorithm, model, and data set, and find extremely large variation between workloads. Along the way, we show that disagreements in the literature on how batch size affects model quality can largely be explained by differences in metaparameter tuning and compute budgets at different batch sizes. We find no evidence that larger batch sizes degrade out-of-sample performance. Finally, we discuss the implications of our results on efforts to train neural networks much faster in the future. Our experimental data is publicly available as a database of 71,638,836 loss measurements taken over the course of training for 168,160 individual models across 35 workloads.

Identifying Exoplanets with Deep Learning. II. Two New Super-Earths Uncovered by a Neural Network in K2 Data

Dattilo, Vanderburg, Shallue et al. 2019

For years, scientists have used data from NASA's Kepler Space Telescope to look for and discover thousands of transiting exoplanets. In its extended K2 mission, Kepler observed stars in various regions of sky all across the ecliptic plane, and therefore in different galactic environments. Astronomers want to learn how the population of exoplanets differs across these environments. However, this requires an automatic and unbiased way to identify the exoplanets in these regions and rule out false positive signals that mimic transiting planet signals. We present a method for classifying these exoplanet signals using deep learning, a class of machine learning algorithms that have become popular in fields ranging from medical science to linguistics. We modified a neural network previously used to identify exoplanets in the Kepler field to be able to identify exoplanets in different K2 campaigns, which span a range of galactic environments. We train a convolutional neural network, called AstroNet-K2, to predict whether a given possible exoplanet signal is really caused by an exoplanet or a false positive. AstroNet-K2 is highly successful at classifying exoplanets and false positives, with an accuracy of 98% on our test set. It is especially efficient at identifying and culling false positives, but for now, still needs human supervision to create a complete and reliable planet candidate sample. We use AstroNet-K2 to identify and validate two previously unknown exoplanets. Our method is a step towards automatically identifying new exoplanets in K2 data and learning how exoplanet populations depend on their galactic birthplace.
