Self-optimizing adaptive optics control with reinforcement learning for high-contrast imaging

Landman, Rico; Haffert, Sebastiaan Y.; Radhakrishnan, Vikram M.; Keller, Christoph U.

doi:10.1117/1.jatis.7.3.039002

Cited by 24 publications

(19 citation statements)

References 0 publications


“…An up-and-coming field of research aimed at improving AO control methods is the application of fully data-driven control methods, where the control voltages are separately added to the learned control model (Nousiainen et al 2021; Landman et al 2020, 2021; Haffert et al 2021a,b; Pou et al 2022). A significant benefit of fully data-driven control in closed-loop is that it does not require an estimate of the system's open-loop temporal evolution and that it is, therefore, insensitive to pseudo-open-loop reconstruction errors, such as the optical gain effect (Haffert et al 2021a).…”

Section: Introduction (mentioning)

confidence: 99%

“…Previous work in RL-based adaptive optics control has focused on either controlling DM modes using model-free methods that learn a policy π_θ : s_t → a_t parameterized by θ that maps states s_t (or observations) into actions a_t directly (Landman et al 2020, 2021; Pou et al 2022), or using model-based methods that employ a planning step to compute actions (Nousiainen et al 2021). The model-free methods have the advantage of being fast to evaluate, as the learned policies are often neural networks that support sub-millisecond inference.…”

Section: Introduction (mentioning)

confidence: 99%
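
The policy mapping π_θ : s_t → a_t described in the statement above can be illustrated with a minimal sketch. This is not the implementation from any of the cited papers: the two-layer network, the dimensions, and the names init_policy, policy, and a_max are all assumptions chosen for illustration. It only shows why inference is cheap, since one evaluation is two small matrix-vector products.

```python
import numpy as np

def init_policy(n_obs, n_hidden, n_act, seed=0):
    """Create the parameters theta of a small two-layer policy (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_obs), (n_hidden, n_obs)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_act, n_hidden)),
        "b2": np.zeros(n_act),
    }

def policy(theta, s_t, a_max=1.0):
    """Map a state s_t directly to an action a_t, bounded to [-a_max, a_max]."""
    hidden = np.tanh(theta["W1"] @ s_t + theta["b1"])
    return a_max * np.tanh(theta["W2"] @ hidden + theta["b2"])

# Assumed example: 40 observation values in, 20 DM mode commands out.
theta = init_policy(n_obs=40, n_hidden=64, n_act=20)
s_t = np.zeros(40)          # a flat (all-zero) observation
a_t = policy(theta, s_t)    # one forward pass, no planning step
```

In a model-free method the parameters θ would be updated from closed-loop data by an RL algorithm; that training loop is deliberately left out of this sketch.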


Toward on-sky adaptive optics control using reinforcement learning

Nousiainen, Rajani, Kasper, et al. 2022, A&A

Context. The direct imaging of potentially habitable exoplanets is one prime science case for the next generation of high contrast imaging instruments on ground-based, extremely large telescopes. To reach this demanding science goal, the instruments are equipped with eXtreme Adaptive Optics (XAO) systems which will control thousands of actuators at a framerate of kilohertz to several kilohertz. Most of the habitable exoplanets are located at small angular separations from their host stars, where the current control laws of XAO systems leave strong residuals. Aims. Current AO control strategies such as static matrix-based wavefront reconstruction and integrator control suffer from a temporal delay error and are sensitive to mis-registration, that is, to dynamic variations of the control system geometry. We aim to produce control methods that cope with these limitations, provide a significantly improved AO correction, and, therefore, reduce the residual flux in the coronagraphic point spread function (PSF). Methods. We extend previous work in reinforcement learning for AO. The improved method, called the Policy Optimization for Adaptive Optics (PO4AO), learns a dynamics model and optimizes a control neural network, called a policy. We introduce the method and study it through numerical simulations of XAO with Pyramid wavefront sensor (PWFS) for the 8-m and 40-m telescope aperture cases. We further implemented PO4AO and carried out experiments in a laboratory environment using Magellan Adaptive Optics eXtreme system (MagAO-X) at the Steward laboratory. Results. PO4AO provides the desired performance by improving the coronagraphic contrast in numerical simulations by factors of 3-5 within the control region of deformable mirror and PWFS, both in simulation and in the laboratory. The presented method is also quick to train, that is, on timescales of typically 5-10 seconds, and the inference time is sufficiently small (< ms) to be used in real-time control for XAO with currently available hardware even for extremely large telescopes.
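
The temporal delay error of integrator control mentioned in the abstract above can be illustrated with a toy, single-mode simulation. This is a sketch under stated assumptions, not the controller or simulation used in the paper: the function run_integrator, the gain of 0.4, the one-frame measurement delay, the 1 kHz loop rate, and the 5 Hz sinusoidal disturbance are all hypothetical choices.

```python
import numpy as np

def run_integrator(disturbance, gain=0.4, delay=1):
    """Closed-loop integrator on one mode; returns the residual at each frame.

    The command applied at frame t+1 is updated from the residual that was
    measured `delay` frames before frame t, so the correction always lags.
    """
    n = len(disturbance)
    command = 0.0
    residuals = np.zeros(n)
    for t in range(n):
        residuals[t] = disturbance[t] - command
        if t - delay >= 0:
            command += gain * residuals[t - delay]
    return residuals

# Assumed toy setup: 2 s of a 5 Hz disturbance sampled at 1 kHz.
time = np.arange(2000) / 1000.0
disturbance = np.sin(2.0 * np.pi * 5.0 * time)
residuals = run_integrator(disturbance)

# Compare RMS over the second half, after the start-up transient has decayed.
open_rms = np.sqrt(np.mean(disturbance[1000:] ** 2))
closed_rms = np.sqrt(np.mean(residuals[1000:] ** 2))
```

In this toy loop the residual is much smaller than the uncorrected disturbance but never reaches zero, because each correction is based on an outdated measurement. That lag is the kind of error that predictive or learned controllers aim to reduce.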


“…Previous work in RL-based adaptive optics control has focused on either controlling DM modes using model-free methods that learn a policy π_θ : s_t → a_t parameterized by θ that maps observations/states s_t into actions a_t directly (Landman et al 2020, 2021; Pou et al 2022), or using model-based methods that employ a planning step to compute actions (Nousiainen et al 2021). The model-free methods have the advantage of being fast to evaluate, as the learned policies are often neural networks that support sub-millisecond inference.…”

Section: Introduction (mentioning)

confidence: 99%

Towards on-sky adaptive optics control using reinforcement learning

Nousiainen, Rajani, Kasper, et al. 2022, Preprint

Context. The direct imaging of potentially habitable exoplanets is one prime science case for the next generation of high contrast imaging instruments on ground-based extremely large telescopes. To reach this demanding science goal, the instruments are equipped with eXtreme Adaptive Optics (XAO) systems which will control thousands of actuators at a framerate of kilohertz to several kilohertz. Most of the habitable exoplanets are located at small angular separations from their host stars, where the current control laws of XAO systems leave strong residuals. Aims. Current AO control strategies like static matrix-based wavefront reconstruction and integrator control suffer from temporal delay error and are sensitive to mis-registration, i.e., to dynamic variations of the control system geometry. We aim to produce control methods that cope with these limitations, provide a significantly improved AO correction, and, therefore, reduce the residual flux in the coronagraphic point spread function (PSF). Methods. We extend previous work in Reinforcement Learning (RL) for AO. The improved method, called PO4AO, learns a dynamics model and optimizes a control neural network, called a policy. We introduce the method and study it through numerical simulations of XAO with Pyramid wavefront sensing for the 8-m and 40-m telescope aperture cases. We further implemented PO4AO and carried out experiments in a laboratory environment using Magellan Adaptive Optics eXtreme system (MagAO-X) at the Steward laboratory. Results. PO4AO provides the desired performance by improving the coronagraphic contrast in numerical simulations by factors of 3-5 within the control region of DM and Pyramid WFS, both in simulation and in the laboratory. The presented method is also quick to train, i.e., on timescales of typically 5-10 seconds, and the inference time is sufficiently small (< ms) to be used in real-time control for XAO with currently available hardware even for extremely large telescopes.


“…[20][21][22] A promising area of research for mitigating these nonlinearities is the use of neural networks for learning a nonlinear mapping between wavefront sensor measurements and wavefront, [23][24][25][26] or for nonlinear control. [27][28][29][30] Furthermore, the similarities between optical systems and Neural Networks have led to studies exploiting automatic differentiation algorithms, initially developed for training NNs, for optimizing elements in the optical system [31,32] or more efficient wavefront control. [33,34] Automatic differentiation allows us to obtain gradients with respect to the free design parameters, even for complex optical systems with multiple elements and planes.…”

Section: Introduction (mentioning)

confidence: 99%

Joint optimization of wavefront sensing and reconstruction with automatic differentiation

Landman, Keller, Por, et al. 2022, Adaptive Optics Systems VIII (self-citation)

High-contrast imaging instruments need extreme wavefront control to directly image exoplanets. This requires highly sensitive wavefront sensors which optimally make use of the available photons to sense the wavefront. Here, we propose to numerically optimize Fourier-filtering wavefront sensors using automatic differentiation. First, we optimize the sensitivity of the wavefront sensor for different apertures and wavefront distributions. We find sensors that are more sensitive than currently used sensors and close to the theoretical limit, under the assumption of monochromatic light. Subsequently, we directly minimize the residual wavefront error by jointly optimizing the sensing and reconstruction. This is done by connecting differentiable models of the wavefront sensor and reconstructor and alternatingly improving them using a gradient-based optimizer. We also allow for nonlinearities in the wavefront reconstruction using Convolutional Neural Networks, which extends the design space of the wavefront sensor. Our results show that optimization can lead to wavefront sensors that have improved performance over currently used wavefront sensors. The proposed approach is flexible, and can in principle be used for any wavefront sensor architecture with free design parameters.
