In this paper, a new training paradigm is proposed for deep reinforcement learning using self-paced prioritized curriculum learning with a coverage penalty. The proposed deep curriculum reinforcement learning (DCRL) method takes full advantage of experience replay by adaptively selecting appropriate transitions from replay memory based on the complexity of each transition. The complexity criteria in DCRL consist of a self-paced priority and a coverage penalty. The self-paced priority reflects the relationship between the temporal-difference error and the difficulty of the current curriculum, and is used for sample efficiency. The coverage penalty is taken into account for sample diversity. The DCRL algorithm is evaluated on Atari 2600 games in comparison with the deep Q network (DQN) and prioritized experience replay (PER) methods, and the experimental results show that DCRL outperforms DQN and PER on most of these games. Further results show that the proposed curriculum training paradigm of DCRL is also applicable and effective for other memory-based deep reinforcement learning approaches, such as double DQN and the dueling network. All the experimental results demonstrate that DCRL can achieve improved training efficiency and robustness for deep reinforcement learning.
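The selection idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the functional forms, so everything below is an assumption made for illustration: the bell-shaped `self_paced_priority`, the reciprocal `coverage_penalty`, the `strength` parameter, and combining the two criteria by multiplication are all hypothetical and are not taken from the paper.

```python
import math
import random


def self_paced_priority(td_error, difficulty):
    """Hypothetical priority: largest when |TD error| matches the curriculum difficulty."""
    return math.exp(-(abs(td_error) - difficulty) ** 2)


def coverage_penalty(replay_count, strength=0.5):
    """Hypothetical penalty: shrinks as a transition is replayed more often."""
    return 1.0 / (1.0 + strength * replay_count)


def sampling_weights(td_errors, replay_counts, difficulty):
    """Combine both criteria (assumed: by product) and normalize to probabilities."""
    scores = [
        self_paced_priority(td, difficulty) * coverage_penalty(count)
        for td, count in zip(td_errors, replay_counts)
    ]
    total = sum(scores)
    return [score / total for score in scores]


def sample_batch(td_errors, replay_counts, difficulty, batch_size, rng=random):
    """Draw transition indices and record that each drawn transition was replayed."""
    weights = sampling_weights(td_errors, replay_counts, difficulty)
    indices = rng.choices(range(len(weights)), weights=weights, k=batch_size)
    for index in indices:
        replay_counts[index] += 1
    return indices
```

In this sketch, raising `difficulty` over the course of training shifts sampling toward transitions with larger temporal-difference errors, while the replay counts keep frequently drawn transitions from dominating the batch.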
Robust control design for quantum systems has been recognized as a key task in quantum information technology, molecular chemistry and atomic physics. In this paper, an improved differential evolution (DE) algorithm, msMS DE, is proposed to search for robust fields for various quantum control problems. In msMS DE, multiple samples are used for fitness evaluation and a mixed strategy is employed for the mutation operation. In particular, the msMS DE algorithm is applied to the control problem of open inhomogeneous quantum ensembles and the consensus problem of a quantum network with uncertainties. Numerical results are presented to demonstrate the excellent performance of the improved DE algorithm for these two classes of quantum robust control problems. Furthermore, msMS DE is experimentally implemented on femtosecond laser control systems to generate good two-photon absorption signals and to control the fragmentation of halomethane molecules CH2BrI. Experimental results demonstrate the excellent performance of msMS DE in searching for effective femtosecond laser pulses for various tasks.
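The two ingredients named in this abstract, fitness averaged over multiple samples and a mixed mutation strategy, can be sketched on a toy problem. This is an illustrative sketch only, not the paper's msMS DE: the choice of mixing DE/rand/1 with DE/best/1 at equal probability, the parameter values, and the `toy_objective` standing in for a quantum control fidelity are all assumptions.

```python
import random


def robust_fitness(candidate, objective, uncertainty_samples):
    """Average the objective over several sampled values of the uncertain parameter."""
    values = [objective(candidate, sample) for sample in uncertainty_samples]
    return sum(values) / len(values)


def mutate(population, best, index, scale, rng):
    """Mixed strategy (assumed mix): pick DE/rand/1 or DE/best/1 with equal probability."""
    others = [member for j, member in enumerate(population) if j != index]
    a, b, c = rng.sample(others, 3)
    base = a if rng.random() < 0.5 else best
    return [base[k] + scale * (b[k] - c[k]) for k in range(len(base))]


def crossover(target, mutant, rate, rng):
    """Binomial crossover; one forced position guarantees the trial differs from the target."""
    forced = rng.randrange(len(target))
    return [
        mutant[k] if (rng.random() < rate or k == forced) else target[k]
        for k in range(len(target))
    ]


def evolve(objective, uncertainty_samples, dim, pop_size=12,
           generations=60, scale=0.6, rate=0.9, seed=0):
    """Maximize the sample-averaged objective with a basic DE loop."""
    rng = random.Random(seed)
    population = [[rng.uniform(-2.0, 2.0) for _ in range(dim)] for _ in range(pop_size)]
    scores = [robust_fitness(p, objective, uncertainty_samples) for p in population]
    for _ in range(generations):
        best = population[max(range(pop_size), key=scores.__getitem__)]
        for i in range(pop_size):
            mutant = mutate(population, best, i, scale, rng)
            trial = crossover(population[i], mutant, rate, rng)
            trial_score = robust_fitness(trial, objective, uncertainty_samples)
            if trial_score >= scores[i]:  # greedy selection keeps the better vector
                population[i], scores[i] = trial, trial_score
    winner = max(range(pop_size), key=scores.__getitem__)
    return population[winner], scores[winner]


def toy_objective(controls, epsilon):
    """Hypothetical stand-in for a fidelity: best when each scaled control equals 1."""
    return -sum((value * (1.0 + epsilon) - 1.0) ** 2 for value in controls)
```

A call such as `evolve(toy_objective, [-0.1, 0.0, 0.1], dim=2)` searches for controls that perform well on average across the three sampled values of the uncertainty rather than for a single nominal value.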
A high-k erbium oxide thin film was grown on a silicon substrate by reactive rf sputtering. It is found that the Er2O3 gate dielectric with a TaN metal gate annealed at 700°C has a higher capacitance value than those annealed at other temperatures and exhibits a lower hysteresis voltage as well as a lower interface trap density in C-V curves. It also shows negligible charge trapping under high constant voltage stress. This behavior is attributed to a rather well-crystallized Er2O3 and to the decrease of the interfacial layer and Er silicate thickness, observed by x-ray diffraction and x-ray photoelectron spectroscopy, respectively.
The key approaches for machine learning, particularly learning in unknown probabilistic environments, are new representations and computation mechanisms. In this paper, a novel quantum reinforcement learning (QRL) method is proposed by combining quantum theory and reinforcement learning (RL). Inspired by the state superposition principle and quantum parallelism, a framework of a value-updating algorithm is introduced. The state (action) in traditional RL is identified as the eigen state (eigen action) in QRL. The state (action) set can be represented with a quantum superposition state, and the eigen state (eigen action) can be obtained by randomly observing the simulated quantum state according to the collapse postulate of quantum measurement. The probability of the eigen action is determined by the probability amplitude, which is updated in parallel according to rewards. Some related characteristics of QRL such as convergence, optimality, and the balance between exploration and exploitation are also analyzed; the analysis shows that this approach makes a good tradeoff between exploration and exploitation using the probability amplitude and can speed up learning through quantum parallelism. To evaluate the performance and practicability of QRL, several simulated experiments are given, and the results demonstrate the effectiveness and superiority of the QRL algorithm for some complex problems. This paper is also an effective exploration of the application of quantum computation to artificial intelligence.
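The action-selection mechanism described in this abstract can be illustrated with a small classical simulation. This is a sketch under stated assumptions, not the paper's algorithm: amplitudes are restricted to real numbers, and the multiplicative `reinforce` rule with its `gain` parameter is a hypothetical stand-in for whatever amplitude update the paper uses.

```python
import math
import random


def uniform_superposition(n_actions):
    """Start with equal amplitude on every eigen action."""
    amplitude = 1.0 / math.sqrt(n_actions)
    return [amplitude] * n_actions


def probabilities(amplitudes):
    """Measurement probability of each eigen action is its squared amplitude."""
    return [a * a for a in amplitudes]


def measure(amplitudes, rng):
    """Simulate collapse: draw one eigen action according to the probabilities."""
    actions = range(len(amplitudes))
    return rng.choices(actions, weights=probabilities(amplitudes), k=1)[0]


def reinforce(amplitudes, action, gain):
    """Hypothetical update: scale one amplitude by (1 + gain), then renormalize.

    A positive gain (good reward) raises the action's probability and a
    gain between -1 and 0 lowers it; gain must stay above -1.
    """
    boosted = list(amplitudes)
    boosted[action] *= 1.0 + gain
    norm = math.sqrt(sum(a * a for a in boosted))
    return [a / norm for a in boosted]
```

Because unrewarded actions keep a nonzero amplitude after renormalization, repeated measurement still explores them occasionally, which is one simple way to picture the exploration and exploitation tradeoff the abstract mentions.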