The combination of Reinforcement Learning (RL) with deep learning has led to a series of impressive feats, with many believing (deep) RL provides a path towards generally capable agents. However, the success of RL agents is often highly sensitive to design choices in the training process, which may require tedious and error-prone manual tuning. This makes it challenging to use RL for new problems and also limits its full potential. In many other areas of machine learning, AutoML has shown that it is possible to automate such design choices, and AutoML has also yielded promising initial results when applied to RL. However, Automated Reinforcement Learning (AutoRL) involves not only standard applications of AutoML but also additional challenges unique to RL, which naturally produce a different set of methods. As such, AutoRL has been emerging as an important area of research in RL, showing promise in a variety of applications, from RNA design to playing games such as Go. Given the diversity of methods and environments considered in RL, much of the research has been conducted in distinct subfields, ranging from meta-learning to evolution. In this survey, we seek to unify the field of AutoRL, provide a common taxonomy, discuss each area in detail, and pose open problems of interest to researchers going forward.
We propose a method for meta-learning reinforcement learning algorithms by searching over the space of computational graphs which compute the loss function for a value-based model-free RL agent to optimize. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. Our method can both learn from scratch and bootstrap off known existing algorithms, like DQN, enabling interpretable modifications which improve performance. Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm. Bootstrapped from DQN, we highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games. The analysis of the learned algorithm behavior shows resemblance to recently proposed RL algorithms that address overestimation in value-based methods.
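As a point of reference for the loss functions mentioned above, the following is a minimal sketch of the standard one-step squared temporal-difference (TD) loss used by value-based agents such as DQN. It is not the paper's search procedure or either of its learned algorithms; the function name `td_loss` and its parameters are hypothetical and chosen only for illustration.

```python
def td_loss(q_sa, reward, next_q_values, gamma=0.99, done=False):
    """Squared one-step TD error for a single transition (s, a, r, s').

    q_sa          -- current estimate Q(s, a)
    reward        -- observed reward r
    next_q_values -- target-network estimates Q(s', a') for every action a'
    gamma         -- discount factor
    done          -- True if s' is terminal (no bootstrapping)
    """
    # Bootstrapped target: r + gamma * max_a' Q(s', a'), or just r at episode end.
    target = reward if done else reward + gamma * max(next_q_values)
    delta = target - q_sa  # TD error
    return delta ** 2


# Example transition: Q(s, a) = 1.0, r = 1.0, Q(s', .) = [0.5, 2.0], gamma = 0.5
loss = td_loss(1.0, 1.0, [0.5, 2.0], gamma=0.5)
```

A search over computational graphs, as described in the abstract, would treat an expression like this one as a single candidate among many possible loss functions.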
Integrated radar and communications (IRC) technology has become very important for civil and military applications in recent years, and IRC waveform design is a major challenge for IRC development. In this paper, we focus on IRC waveform design based on the multi-symbol orthogonal frequency division multiplexing (OFDM) technique. To address the defects caused by high peak-to-mean envelope power ratios (PMEPRs) and high range sidelobes in IRC systems, we propose an intelligent and effective IRC waveform design method that jointly optimizes the PMEPR and the peak-to-sidelobe ratio (PSLR). Firstly, a flexible tone reservation (TR)-based IRC waveform structure is applied in both the temporal and frequency domains, i.e., a multi-symbol OFDM waveform. Secondly, the optimization problem is reformulated by considering both the PMEPR and the PSLR and extending them to the Lp-norm form. Then, the conjugate gradient of the objective function is analytically derived, and the conjugate gradient algorithm (CGA) is presented to simultaneously improve the PMEPR and PSLR. Finally, simulation results show that the proposed algorithm can efficiently generate IRC waveforms with excellent PMEPR, PSLR, radar signal-to-noise ratio (SNR), and bit error rate (BER) performance.
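To make the PMEPR quantity concrete, the sketch below computes the ratio of peak to mean envelope power for a single OFDM symbol using an oversampled inverse DFT. It only illustrates the metric and does not implement the TR structure, the Lp-norm objective, or the CGA from the abstract. The function names, the oversampling factor, and the normalization are assumptions made for this example.

```python
import cmath
import math


def ofdm_time_samples(symbols, oversample=4):
    """Oversampled inverse DFT of one OFDM symbol (one value per subcarrier)."""
    n = len(symbols)
    total = n * oversample
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / total) for k, c in enumerate(symbols))
        / math.sqrt(n)
        for t in range(total)
    ]


def pmepr(symbols, oversample=4):
    """Peak envelope power divided by mean envelope power of the sampled waveform."""
    powers = [abs(s) ** 2 for s in ofdm_time_samples(symbols, oversample)]
    return max(powers) / (sum(powers) / len(powers))


# Worst case: identical symbols on all 8 subcarriers add coherently at t = 0.
worst = pmepr([1] * 8)
# Best case: a single active subcarrier has a constant envelope.
best = pmepr([1, 0, 0, 0])
```

With identical symbols on every subcarrier the ratio equals the number of subcarriers, which is the kind of peak that PMEPR-reduction methods aim to avoid.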
The increasing accessibility of unmanned aerial vehicles (UAVs) drives the demand for reliable, easy-to-deploy surveillance systems to consolidate public security. This paper employs passive bistatic radar (PBR) based on a digital audio broadcast (DAB) satellite for UAV monitoring in applications with power density limitations on electromagnetic radiation. An advanced version of the extensive cancellation algorithm (ECA), based on data segmentation and coefficient filtering, is designed to improve the efficiency of multipath clutter suppression while retaining robustness; its effectiveness is verified by theoretical derivation and simulation. The detectability of small UAVs with DAB satellite-based PBR is validated with experimental results, which are also used to analyze the influence of target altitude and bistatic geometry.
In this paper, the problem of constructing the measurement matrix in compressed sensing is addressed. In compressed sensing, it is of interest to construct a measurement matrix that performs well and is easy to implement in hardware. It has recently been shown that measurement matrices constructed from Logistic or Tent chaotic sequences satisfy the restricted isometry property (RIP) with a certain probability and are easy to implement in physical electric circuits. However, obtaining uncorrelated samples from these sequences requires a large sample distance, which implies high resource consumption. To solve this problem, we propose a method of constructing the measurement matrix from the Chebyshev chaotic sequence. The method effectively reduces the sample distance, and the proposed measurement matrix is proved to satisfy the RIP with high probability under the assumption that the sampled elements are statistically independent. Simulation results show that the proposed measurement matrix has reconstruction performance comparable to that of existing chaotic matrices for compressed sensing.
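The construction described above can be sketched roughly as follows, assuming the common form of the Chebyshev map, x_{k+1} = cos(w * arccos(x_k)), on [-1, 1]. The initial value, map order, sample distance, burn-in length, column-wise fill order, and the sqrt(2/M) scaling are all assumptions for this example and are not taken from the paper.

```python
import math


def chebyshev_sequence(x0, order, length, sample_distance=1, burn_in=100):
    """Iterate the Chebyshev map and keep every `sample_distance`-th value."""
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = math.cos(order * math.acos(x))
    out = []
    for _ in range(length):
        for _ in range(sample_distance):
            x = math.cos(order * math.acos(max(-1.0, min(1.0, x))))
        out.append(x)
    return out


def chebyshev_measurement_matrix(m, n, x0=0.3, order=4, sample_distance=2):
    """Fill an m x n matrix column by column with sampled Chebyshev values.

    The scale sqrt(2/m) assumes the samples follow the map's invariant
    distribution (variance 1/2), so columns have unit norm in expectation.
    """
    seq = chebyshev_sequence(x0, order, m * n, sample_distance)
    scale = math.sqrt(2.0 / m)
    return [[scale * seq[col * m + row] for col in range(n)] for row in range(m)]


# A small 4 x 6 matrix for illustration.
A = chebyshev_measurement_matrix(4, 6)
```

Because the sequence is fully determined by `x0` and `order`, the same matrix can be regenerated from those few parameters, which is part of what makes chaotic constructions attractive for hardware.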