In this work, we propose a method for the control and anti-control of chaos based on the moving largest Lyapunov exponent and reinforcement learning. In this method, we design the reward function of the reinforcement learning agent from the moving largest Lyapunov exponent, which, similarly to a moving average, computes the largest Lyapunov exponent from a recently updated time series of fixed, short length. We adopt the density-peaks-based clustering algorithm to determine a linear region of the average divergence index, so that the largest Lyapunov exponent of the small data set can be obtained by fitting the slope of that linear region. We show that the proposed method is fast and easy to implement by controlling and anti-controlling typical systems such as the Hénon map and the Lorenz system.
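A minimal sketch of the moving largest Lyapunov exponent on a fixed-length window is given below. It is an assumed implementation, not the authors' code: it follows Rosenstein's small-data-set idea (average log-divergence of nearest-neighbour pairs, slope fit over a linear region), and the density-peaks clustering step for locating the linear region is replaced by a fixed fitting range `fit_range` for brevity. The embedding parameters and window length are illustrative.

```python
import numpy as np


def average_divergence(x, dim=3, tau=1, max_steps=20):
    """Average log-divergence curve of nearest-neighbour pairs
    (small-data-set / Rosenstein-style estimate)."""
    n = len(x) - (dim - 1) * tau
    # Delay embedding of the scalar series x.
    emb = np.column_stack([x[i * tau:i * tau + n] for i in range(dim)])
    usable = n - max_steps
    dists = np.linalg.norm(emb[:usable, None] - emb[None, :usable], axis=2)
    # Exclude temporally close points when searching for nearest neighbours.
    for i in range(usable):
        lo, hi = max(0, i - tau), min(usable, i + tau + 1)
        dists[i, lo:hi] = np.inf
    nn = np.argmin(dists, axis=1)
    # Track how each neighbour pair separates over max_steps iterations.
    curve = np.empty(max_steps)
    for k in range(max_steps):
        sep = np.linalg.norm(emb[np.arange(usable) + k] - emb[nn + k], axis=1)
        curve[k] = np.mean(np.log(sep[sep > 0]))
    return curve


def moving_lle(window, fit_range=(1, 10), **kwargs):
    """Largest Lyapunov exponent of the most recent window: slope of the
    (assumed) linear region of the divergence curve."""
    curve = average_divergence(np.asarray(window, float), **kwargs)
    i0, i1 = fit_range
    slope, _ = np.polyfit(np.arange(i0, i1), curve[i0:i1], 1)
    return slope
```

Under these assumptions, a reward for control could be taken as the negative of `moving_lle` evaluated on the latest window (and its positive value for anti-control), so that maximizing the cumulative reward drives the exponent down (or up).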
We propose a new method to enhance stochastic resonance based on reinforcement learning, which does not require a priori knowledge of the underlying dynamics. The reward function of the reinforcement learning algorithm is determined by introducing a moving signal-to-noise ratio, which promptly quantifies the ratio of signal power to noise power from a time series of fixed length that is continually updated. To maximize the cumulative reward, the reward function guides the actions, with the help of the moving signal-to-noise ratio, to enhance the signal-to-noise ratio of the system as much as possible. Since the occurrence of spikes in excitable systems, which requires the system to evolve for some time, is an essential ingredient in the definition of the signal-to-noise ratio, the reward corresponding to the current moment cannot be obtained immediately, and this usually results in a delayed reward. The delayed reward may cause the policy of the reinforcement learning algorithm to be updated with a mismatched reward, which degrades the stability and convergence of the algorithm. To overcome this challenge, we devise a double Q-table technique, in which one Q-table is used to generate actions and the other is used to correct deviations. In this way, the policy is updated with its corresponding reward, which improves the stability of the algorithm and accelerates its convergence. We show with two illustrative examples, the FitzHugh–Nagumo and Hindmarsh–Rose models, that stochastic resonance is significantly enhanced by the proposed method for two typical types of stochastic resonance: classical stochastic resonance with a weak signal and coherence resonance without a weak signal. We also demonstrate the robustness of the proposed method.
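The double Q-table technique is specific to this work; the sketch below is only one plausible reading of it, under assumptions. Here a table `q_act` selects epsilon-greedy actions, transitions wait in a buffer until their delayed reward (for example, a moving signal-to-noise ratio computed after the system has evolved through the spiking window) becomes available, and `q_corr` is then updated with the matching transition and copied back into `q_act`, so every update pairs a transition with its own reward. The class and method names are illustrative, not the authors' exact algorithm.

```python
import numpy as np
from collections import deque


class DelayedRewardQLearner:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 eps=0.1, seed=0):
        self.q_act = np.zeros((n_states, n_actions))   # generates actions
        self.q_corr = np.zeros((n_states, n_actions))  # corrects deviations
        self.buffer = deque()                          # transitions awaiting their reward
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.rng = np.random.default_rng(seed)

    def act(self, state):
        """Epsilon-greedy action drawn from the action table."""
        if self.rng.random() < self.eps:
            return int(self.rng.integers(self.q_act.shape[1]))
        return int(np.argmax(self.q_act[state]))

    def observe(self, state, action, next_state):
        """Store a transition whose reward is not yet known."""
        self.buffer.append((state, action, next_state))

    def reward_arrived(self, reward):
        """Delayed reward for the oldest buffered transition: update the
        correction table with the matching transition, then refresh the
        action table so the policy uses the corrected value."""
        s, a, s_next = self.buffer.popleft()
        td_target = reward + self.gamma * np.max(self.q_corr[s_next])
        self.q_corr[s, a] += self.alpha * (td_target - self.q_corr[s, a])
        self.q_act[s, a] = self.q_corr[s, a]
```

In this sketch the caller invokes `reward_arrived` once the moving signal-to-noise ratio over the post-action window has been computed, so updates are never applied with a reward belonging to a different transition.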