Machines learn to infer stellar parameters just by looking at a large number of spectra

Sedaghat, Nima; Romaniello, M.; Carrick, Jonathan; Pineau, F. X.

doi:10.1093/mnras/staa3540

Cited by 15 publications

(9 citation statements)

References 39 publications

“…The variational autoencoder (VAE; Kingma & Welling 2014) is a variant of AE, and is widely used in astronomy (e.g. Portillo et al 2020; Sedaghat et al 2021). Although the architecture of VAE is similar to that of AE, the concept behind is very different, and VAE is based on the variational Bayesian inference (Kingma & Welling 2014).…”

Section: Variational Autoencoder (mentioning)

confidence: 99%

Investigation of stellar magnetic activity using variational autoencoder based on low-resolution spectroscopic survey

Xiang, Gu, Cao

2022

Preprint

We apply the variational autoencoder (VAE) to the LAMOST-K2 low-resolution spectra to detect the magnetic activity of the stars in the K2 field. After the training on the spectra of the selected inactive stars, the VAE model can efficiently generate the synthetic reference templates needed by the spectral subtraction procedure, without knowing any stellar parameters. Then we detect the peculiar spectral features, such as chromospheric emissions, strong nebular emissions and lithium absorptions, in our sample. We measure the emissions of the chromospheric activity indicators, Hα and Ca infrared triplet (IRT) lines, to quantify the stellar magnetic activity. The excess emissions of Hα and Ca IRT lines of the active stars are correlated well to the rotational periods and the amplitudes of light curves derived from the K2 photometry. We degrade the LAMOST spectra to simulate the slitless spectra of the China Space Station Telescope (CSST) and apply the VAE to the simulated data. For cool active stars, we reveal a good agreement between the equivalent widths (EWs) of Hα line derived from the spectra with two resolutions. The result indicates the ability of identifying the magnetically active stars in the future CSST survey, which will deliver an unprecedented large database of low-resolution spectra as well as simultaneous multi-band photometry of stars.
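For orientation, the variational objective behind a VAE of the kind described above is commonly written as the evidence lower bound (ELBO). The symbols used here (encoder q_φ, decoder p_θ, latent prior p(z)) follow the standard notation of the VAE literature and are assumptions of this note, not notation taken from the abstract.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
```

The first term rewards faithful reconstruction of the input spectrum x from the latent code z, and the second keeps the approximate posterior close to the prior. Training maximizes the right-hand side with respect to both θ and φ.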

“…for backpropagation when training DL models), such neural-network-based estimators do not necessarily return an accurate estimate of equation (1), are heavily dependent on the training hyperparameters, and have been shown to suffer from a poor variance-bias tradeoff [75]. The use of MI estimates for interpreting deep representation learning has recently been investigated as well [19,32,79,80]; however, exploiting MI to interpret deep representation learning requires a robust density estimate of the joint probability distribution between latent variables and relevant physical parameters, and the uncertainties on the MI estimate to be quantified, ensuring that any trends in MI are statistically significant.…”

Section: Introduction (mentioning)

confidence: 99%

A robust estimator of mutual information for deep learning interpretability

Piras, Peiris, Pontzen et al.

2023

Mach. Learn.: Sci. Technol.

We develop the use of mutual information (MI), a well-established metric in information theory, to interpret the inner workings of deep learning models. To accurately estimate MI from a finite number of samples, we present GMM-MI (pronounced “Jimmie”), an algorithm based on Gaussian mixture models that can be applied to both discrete and continuous settings. GMM-MI is computationally efficient, robust to the choice of hyperparameters and provides the uncertainty on the MI estimate due to the finite sample size. We extensively validate GMM-MI on toy data for which the ground truth MI is known, comparing its performance against established mutual information estimators. We then demonstrate the use of our MI estimator in the context of representation learning, working with synthetic data and physical datasets describing highly non-linear processes. We train deep learning models to encode high-dimensional data within a meaningful compressed (latent) representation, and use GMM-MI to quantify both the level of disentanglement between the latent variables, and their association with relevant physical quantities, thus unlocking the interpretability of the latent representation. Upon acceptance of this work, we will link a publicly available GitHub repository which contains GMM-MI and the code to reproduce the results of this paper.
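As a rough illustration of the idea in the abstract above (an MI estimate that comes with a finite-sample uncertainty), the sketch below treats the simplest possible case: a single Gaussian component, for which mutual information has the closed form -0.5 ln(1 - ρ²). This is not the GMM-MI algorithm or its code. The function names (`gaussian_mi`, `bootstrap_mi`, `sample_correlated`) are hypothetical, and the use of bootstrap resampling for the error bar is an assumption made for this example.

```python
import math
import random


def gaussian_mi(xs, ys):
    """MI in nats assuming (x, y) is bivariate Gaussian: -0.5 * ln(1 - rho^2)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    rho = sxy / math.sqrt(sxx * syy)  # sample Pearson correlation
    return -0.5 * math.log(1.0 - rho * rho)


def bootstrap_mi(xs, ys, n_boot=200, seed=0):
    """Mean and standard deviation of the MI estimate over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        # Resample (x, y) pairs with replacement to mimic finite-sample scatter.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(gaussian_mi([xs[i] for i in idx], [ys[i] for i in idx]))
    mean = sum(estimates) / n_boot
    var = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return mean, math.sqrt(var)


def sample_correlated(n, rho, seed=0):
    """Draw n pairs from a standard bivariate Gaussian with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho * rho) * e)
    return xs, ys


if __name__ == "__main__":
    xs, ys = sample_correlated(2000, 0.8, seed=1)
    mi, err = bootstrap_mi(xs, ys, n_boot=100)
    truth = -0.5 * math.log(1.0 - 0.8 ** 2)
    print(f"estimated MI = {mi:.3f} +/- {err:.3f} nats (analytic: {truth:.3f})")
```

A mixture with several components would be needed for non-Gaussian joint distributions, where no closed form exists and the MI integral has to be evaluated numerically; the single-component case is used here only because it can be checked against the analytic value.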

“…Most of these methods require prior knowledge of the system of interest, for example a priori knowledge of the relevant variables or the underlying dimensionality. Recently, Sedaghat et al [33] adopted a similar architecture to SciNet in an unsupervised setting, where a neural network is trained to find a low-dimensional representation of stellar spectra. Similar to our work, they used mutual information for interpretability; however, their use of mutual information was limited to identifying potential correlations between the latent representation and previously known parameters.…”

Section: Introduction (mentioning)

confidence: 99%

Discovering the building blocks of dark matter halo density profiles with neural networks

Lucie-Smith, Peiris, Pontzen et al.

2022

Preprint

The density profiles of dark matter halos are typically modeled using empirical formulae fitted to the density profiles of relaxed halo populations. We present a neural network model that is trained to learn the mapping from the raw density field containing each halo to the dark matter density profile. We show that the model recovers the widely-used Navarro-Frenk-White (NFW) profile out to the virial radius, and can additionally describe the variability in the outer profile of the halos. The neural network architecture consists of a supervised encoder-decoder framework, which first compresses the density inputs into a low-dimensional latent representation, and then outputs ρ(r) for any desired value of radius r. The latent representation contains all the information used by the model to predict the density profiles. This allows us to interpret the latent representation by quantifying the mutual information between the representation and the halos' ground-truth density profiles. A two-dimensional representation is sufficient to accurately model the density profiles up to the virial radius; however, a three-dimensional representation is required to describe the outer profiles beyond the virial radius. The additional dimension in the representation contains information about the infalling material in the outer profiles of dark matter halos, thus discovering the splashback boundary of halos without prior knowledge of the halos' dynamical history.
