We seek to remove foreground contaminants from 21 cm intensity mapping observations. We demonstrate that a deep convolutional neural network (CNN) with a UNet architecture and three-dimensional convolutions, trained on simulated observations, can effectively separate frequency and spatial patterns of the cosmic neutral hydrogen (HI) signal from foregrounds in the presence of noise. Cleaned maps recover cosmological clustering amplitude and phase within 20% at all relevant angular scales and frequencies. This amounts to a reduction in prediction variance of over an order of magnitude across angular scales, and improved accuracy for intermediate radial scales (0.025 < k∥ < 0.075 h Mpc-1) compared to standard Principal Component Analysis (PCA) methods. We estimate epistemic confidence intervals for the network's prediction by training an ensemble of UNets. Our approach demonstrates the feasibility of analyzing 21 cm intensity maps, as opposed to derived summary statistics, for upcoming radio experiments, as long as the simulated foreground model is sufficiently realistic. We provide the code used for this analysis on GitHub , as well as a browser-based tutorial for the experiment and UNet model via the accompanying Colab notebook .
We present a comparison of simulation-based inference to full, field-based analytical inference in cosmological data analysis. To do so, we explore parameter inference for two cases where the information content is calculable analytically: Gaussian random fields whose covariance depends on parameters through the power spectrum; and correlated lognormal fields with cosmological power spectra. We compare two inference techniques: i) explicit field-level inference using the known likelihood and ii) implicit likelihood inference with maximally informative summary statistics compressed via Information Maximising Neural Networks (IMNNs). We find that a) summaries obtained from convolutional neural network compression do not lose information and therefore saturate the known field information content, both for the Gaussian covariance and the lognormal cases, b) simulation-based inference using these maximally informative nonlinear summaries recovers nearly losslessly the exact posteriors of field-level inference, bypassing the need to evaluate expensive likelihoods or invert covariance matrices, and c) even for this simple example, implicit, simulation-based likelihood incurs a much smaller computational cost than inference with an explicit likelihood. This work uses a new IMNN implementation in Jax that can take advantage of fully-differentiable simulation and inference pipeline. We also demonstrate that a single retraining of the IMNN summaries effectively achieves the theoretically maximal information, enhancing the robustness to the choice of fiducial model where the IMNN is trained.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.