Adaptive feedback schemes are promising for quantum-enhanced measurements yet are complicated to design. Machine learning can autonomously generate algorithms in a classical setting. Here we adapt machine learning to quantum information and use our framework to generate autonomous adaptive feedback schemes for quantum measurement. In particular, our approach replaces guesswork in quantum measurement with a logical, fully automatic, programmable routine. We show that our method yields schemes that outperform the best known adaptive scheme for interferometric phase estimation.
Quantum-enhanced metrology infers an unknown quantity with accuracy beyond the standard quantum limit (SQL). Feedback-based metrological techniques are promising for beating the SQL, but devising the feedback procedures is difficult and inefficient. Here we introduce an efficient self-learning swarm-intelligence algorithm for devising feedback-based quantum metrological procedures. Our algorithm can be trained with simulated or real-world trials and accommodates experimental imperfections, losses, and decoherence.

Precise metrology underpins modern science and engineering. However, the 'standard quantum limit' (SQL) restricts achievable precision, beyond which measurement must be treated on a quantum level. Quantum-enhanced metrology (QEM) aims to beat the SQL by exploiting entangled or squeezed input states and a sophisticated detection strategy [1][2][3]. Feedback-based QEM is most effective, as accumulated measurement data are exploited to maximize information gain in subsequent measurements, but finding an optimal QEM policy for a given measurement device is computationally intractable even for pure input states, unitary evolution U, and projective measurements. Typically, policies have been devised by clever guessing [4,5] or brute-force numerical optimization [5]. Recently we introduced swarm-intelligence reinforcement learning to devise optimal policies for measuring an interferometric phase shift [6]. Our algorithm is space-efficient; i.e., its memory requirement is a polynomial function of the number of times N that U is effected, in contrast to the exponentially expensive brute-force algorithm. Although our result demonstrated the power of reinforcement learning, our algorithm requires a runtime that is exponential in N and a perfect interferometer, thereby effectively restricting its applicability to proofs of principle. Here we report a space- and time-efficient algorithm (based on new heuristics) for devising QEM policies.
Our algorithm works for noisy evolution and loss, thus making reinforcement learning viable for autonomous design of feedback-based QEM in a real-world setting.

We restrict our focus to single-parameter QEM. Interferometric phase estimation is the canonical quantum-metrology problem and is applicable to measurements of time, displacements, and imaging. Therefore, we develop and benchmark our algorithm for autonomous policy design in this context. To beat the SQL, we employ an entangled sequence of N input photons, feedback control, and direct measurements of the interferometer output. For adaptive phase estimation, the interferometer processes one photon at a time. Each input photon can be in two modes, labeled {|0⟩, |1⟩}, corresponding to the interferometer's two paths. Thus, a time-ordered sequence of N photons implements an N-qubit state. We assume that the interferometric transformation ...

Figure 1. Adaptive feedback scheme for estimating an interferometric phase ϕ. The input state |Ψ_N⟩ is fed into the unital quantum channel C one qubit at a time, and the output qubit is measured or lost.
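The swarm-intelligence policy search described above can be illustrated with a toy simulation. The sketch below is a hedged illustration, not the paper's algorithm: it optimizes the feedback-phase increments of a simple single-photon adaptive scheme with a plain global-best particle swarm, using estimate sharpness as the objective. The policy parameterization, the estimator, and every constant here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4                                                   # photons per run
PHASES = np.linspace(0, 2 * np.pi, 16, endpoint=False)  # test phases

def sharpness(policy):
    """Average |<exp(i(phi_est - phi))>| of a simple feedback policy.

    policy[k] is the feedback increment applied after the k-th photon;
    each photon yields outcome 0 with probability cos^2((phi - theta)/2),
    and the final feedback phase theta serves as the estimate.
    """
    vals = []
    for phi in PHASES:
        for _ in range(10):                 # Monte Carlo runs per phase
            theta = 0.0
            for k in range(N):
                p0 = np.cos((phi - theta) / 2) ** 2
                outcome = 0 if rng.random() < p0 else 1
                theta += policy[k] if outcome else -policy[k]
            vals.append(np.exp(1j * (theta - phi)))
    return abs(np.mean(vals))

# Plain global-best particle swarm over the N policy increments.
n_particles, iters, w, c1, c2 = 10, 12, 0.7, 1.5, 1.5
x = rng.uniform(0, np.pi, (n_particles, N))   # particle positions
v = np.zeros_like(x)                          # particle velocities
pbest = x.copy()                              # personal bests
pbest_f = np.array([sharpness(p) for p in x])
gbest_f0 = pbest_f.max()                      # initial best, for reference
g = pbest[pbest_f.argmax()].copy()            # global best

for _ in range(iters):
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
    x = x + v
    f = np.array([sharpness(p) for p in x])
    improved = f > pbest_f
    pbest[improved], pbest_f[improved] = x[improved], f[improved]
    g = pbest[pbest_f.argmax()].copy()

print(f"best sharpness: {pbest_f.max():.3f} (initial {gbest_f0:.3f})")
```

Because personal bests are only ever overwritten by better scores, the recorded best sharpness is non-decreasing over iterations; the paper's actual objective, input states, and noise model are of course richer than this sketch.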
Fuzzy controllers are efficient and interpretable system controllers for continuous state and action spaces. To date, such controllers have been constructed manually or trained automatically, either using expert-generated problem-specific cost functions or by incorporating detailed knowledge about the optimal control strategy. Neither requirement for automatic training is met in most real-world reinforcement learning (RL) problems. In such applications, online learning is often prohibited for safety reasons because it requires exploration of the problem's dynamics during policy training. We introduce a fuzzy particle swarm reinforcement learning (FPSRL) approach that can construct fuzzy RL policies solely by training parameters on world models that simulate real system dynamics. These world models are created by an autonomous machine learning technique that uses previously generated transition samples of a real system. To the best of our knowledge, this approach is the first to relate self-organizing fuzzy controllers to model-based batch RL. FPSRL is intended for domains where online learning is prohibited, system dynamics are relatively easy to model from previously generated default policy transition samples, and a relatively easily interpretable control policy is expected to exist. We demonstrate the efficiency of the proposed approach on three standard RL benchmarks from such domains: mountain car, cart-pole balancing, and cart-pole swing-up. Our experimental results demonstrate high-performing, interpretable fuzzy policies.
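The fuzzy policies that FPSRL trains can be made concrete with a minimal sketch. The snippet below implements a zero-order Takagi-Sugeno-style controller with Gaussian membership functions; the concatenated parameters (rule centers, widths, and outputs) are exactly the kind of flat vector a particle swarm would optimize against a world model. All names, shapes, and constants here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fuzzy_policy(state, centers, widths, outputs):
    """Continuous action as the membership-weighted mean of rule outputs.

    state: (d,) observation; centers, widths: (r, d) per-rule Gaussian
    parameters; outputs: (r,) per-rule action values.
    """
    # Gaussian membership of the state in each rule (product over dims).
    m = np.exp(-0.5 * np.sum(((state - centers) / widths) ** 2, axis=1))
    # Normalized weighted combination; epsilon guards an all-zero firing.
    return float(np.dot(m, outputs) / (m.sum() + 1e-12))

# Two hypothetical rules on a 2-D state (e.g. position and velocity):
centers = np.array([[-1.0, 0.0], [1.0, 0.0]])
widths = np.ones((2, 2))
outputs = np.array([-1.0, 1.0])   # "push left" / "push right"

# A state near the second rule's center fires that rule most strongly,
# so the blended action comes out positive.
action = fuzzy_policy(np.array([0.9, 0.1]), centers, widths, outputs)
print(f"action = {action:.3f}")
```

The interpretability claim in the abstract rests on this structure: each rule reads as "if the state is near this prototype, apply this action", and the swarm only tunes the prototypes and action values.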
In the research area of reinforcement learning (RL), novel and promising methods are frequently developed and introduced to the RL community. However, although many researchers are keen to apply their methods to real-world problems, implementing such methods in real industry environments is often a frustrating and tedious process. Generally, academic research groups have only limited access to real industrial data and applications. For this reason, new methods are usually developed, evaluated, and compared using artificial software benchmarks. On the one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into the learning process of the method at hand. On the other hand, they usually do not share much similarity with industrial real-world applications. We therefore used our industry experience to design a benchmark that bridges the gap between freely available, documented, and motivated artificial benchmarks and the properties of real industrial problems. The resulting industrial benchmark (IB) has been made publicly available to the RL community by publishing its Java and Python code, including an OpenAI Gym wrapper, on GitHub. In this paper we motivate and describe in detail the IB's dynamics and identify prototypic experimental settings that capture common situations in real-world industrial control problems.