The family of natural evolution strategies (NES) offers a principled approach to real-valued evolutionary optimization by following the natural gradient of the expected fitness. Like the well-known CMA-ES, the most competitive algorithm in the field, NES comes with important invariance properties. In this paper, we introduce a number of elegant and efficient improvements to the basic NES algorithm. First, we propose to parameterize the positive definite covariance matrix using the exponential map, which allows the covariance matrix to be updated in a vector space. This new technique makes the algorithm completely invariant under linear transformations of the underlying search space, which was previously achieved only in the limit of small step sizes. Second, we compute all updates in the natural coordinate system, so that the natural gradient coincides with the vanilla gradient. This way we avoid the computation of the inverse Fisher information matrix, which is the main computational bottleneck of the original NES algorithm. Our new algorithm, exponential NES (xNES), is significantly simpler than its predecessors. We show that the various update rules in CMA-ES are closely related to the natural gradient updates of xNES. However, xNES is more principled than CMA-ES, as all the update rules needed for covariance matrix adaptation are derived from a single principle. We empirically assess the performance of the new algorithm on standard benchmark functions.
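The two ideas in the abstract above (a multiplicative, exponential update of the scale parameters, and gradients computed in the standard-normal "natural" coordinates so that no Fisher matrix is inverted) can be illustrated with a minimal sketch. This is not the xNES algorithm itself: it is a simplified diagonal-covariance variant, in which the matrix exponential reduces to an element-wise `exp`. The function name `snes_minimize`, the rank-based utilities, and the learning-rate settings are assumptions chosen for illustration, not values taken from the abstract.

```python
import math
import random


def snes_minimize(f, mu, sigma, iterations=200, population=None, seed=0):
    """Diagonal-covariance NES sketch (hypothetical helper); minimizes f.

    mu and sigma are the per-coordinate mean and standard deviation of
    the Gaussian search distribution.
    """
    rng = random.Random(seed)
    d = len(mu)
    mu, sigma = list(mu), list(sigma)
    n = population or 4 + int(3 * math.log(d))
    # Assumed learning rates, chosen for illustration only.
    eta_mu = 1.0
    eta_sigma = (3 + math.log(d)) / (5 * math.sqrt(d))
    # Rank-based utilities: the best sample gets the largest weight,
    # and the weights sum to zero.
    raw = [max(0.0, math.log(n / 2 + 1) - math.log(k + 1)) for k in range(n)]
    total = sum(raw)
    utilities = [r / total - 1.0 / n for r in raw]

    for _ in range(iterations):
        # Sample in natural coordinates z ~ N(0, I), then map to x.
        zs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
        xs = [[mu[i] + sigma[i] * z[i] for i in range(d)] for z in zs]
        order = sorted(range(n), key=lambda k: f(xs[k]))  # best first

        grad_mu = [0.0] * d
        grad_sigma = [0.0] * d
        for rank, k in enumerate(order):
            u = utilities[rank]
            for i in range(d):
                grad_mu[i] += u * zs[k][i]
                grad_sigma[i] += u * (zs[k][i] ** 2 - 1.0)

        for i in range(d):
            mu[i] += eta_mu * sigma[i] * grad_mu[i]
            # Exponential update: sigma stays positive by construction.
            sigma[i] *= math.exp(0.5 * eta_sigma * grad_sigma[i])
    return mu, sigma


def sphere(x):
    """Simple unimodal test function with its minimum at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    best_mu, best_sigma = snes_minimize(sphere, [3.0] * 5, [1.0] * 5)
    print(sphere(best_mu), best_sigma)
```

Because the scale parameters are updated through `exp`, the update itself lives in an unconstrained vector space and can never produce a non-positive standard deviation. The full-covariance case described in the abstract replaces the element-wise exponential with a matrix exponential.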
To maximize its success, an AGI typically needs to explore its initially unknown world. Is there an optimal way of doing so? Here we derive an affirmative answer for a broad class of environments.
Efficient Natural Evolution Strategies (eNES) is a novel alternative to conventional evolutionary algorithms, using the natural gradient to adapt the mutation distribution. Unlike previous methods based on natural gradients, eNES uses a fast algorithm to calculate the inverse of the exact Fisher information matrix, thus increasing both robustness and performance of its evolution gradient estimation, even in higher dimensions. Additional novel aspects of eNES include optimal fitness baselines and importance mixing (a procedure for updating the population with very few fitness evaluations). The algorithm yields competitive results on both unimodal and multimodal benchmarks.
This paper discusses parameter-based exploration methods for reinforcement learning. Parameter-based methods perturb parameters of a general function approximator directly, rather than adding noise to the resulting actions. Parameter-based exploration unifies reinforcement learning and black-box optimization, and has several advantages over action perturbation. We review two recent parameter-exploring algorithms: Natural Evolution Strategies and Policy Gradients with Parameter-Based Exploration. Both outperform state-of-the-art algorithms in several complex high-dimensional tasks commonly found in robot control. Furthermore, we describe how a novel exploration method, State-Dependent Exploration, can modify existing algorithms to mimic exploration in parameter space.
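The distinction drawn above between perturbing actions and perturbing parameters can be made concrete with a small sketch. It assumes a toy linear policy. The names `act`, `rollout_action_noise`, and `rollout_parameter_noise` are hypothetical, and the code illustrates only the two exploration styles, not either of the reviewed algorithms.

```python
import random


def act(weights, state):
    """Deterministic linear policy: action = weights . state."""
    return sum(w * s for w, s in zip(weights, state))


def rollout_action_noise(weights, states, noise_std, rng):
    """Action perturbation: fresh noise is added to every single action."""
    return [act(weights, s) + rng.gauss(0.0, noise_std) for s in states]


def rollout_parameter_noise(weights, states, noise_std, rng):
    """Parameter perturbation: the weights are perturbed once per episode,
    after which the policy acts deterministically."""
    perturbed = [w + rng.gauss(0.0, noise_std) for w in weights]
    return [act(perturbed, s) for s in states], perturbed


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [0.5, -0.2]
    states = [[1.0, 2.0]] * 3  # the same state visited three times
    print(rollout_action_noise(weights, states, 0.1, rng))
    print(rollout_parameter_noise(weights, states, 0.1, rng)[0])
```

With parameter perturbation, revisiting the same state within an episode yields the same action, so the return of an episode can be attributed to one sampled parameter vector. That is what lets a black-box optimizer treat the episode return as a fitness value.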