A new graph-based evolutionary algorithm called Genetic Network Programming (GNP) has been proposed. GNP represents its solutions as graph structures, which gives it several distinctive abilities: for example, GNP can memorize past action sequences in the network flow and form compact structures. However, conventional GNP is purely evolutionary, i.e., programs are evaluated and evolved based on their fitness values only after they have been carried out to some extent, so many trials must be repeated. Therefore, GNP with Reinforcement Learning (GNP-RL), which combines evolution-based GNP and RL, has been proposed. Because RL is applied while an agent is carrying out its task, GNP-RL can search for better solutions at every judgment and processing step during task execution, in addition to the evolutionary operations executed after task execution. The aim of combining evolution and RL is to exploit both the diversified search ability of evolution and the intensified search ability of RL with online learning. Moreover, the Q-table can be kept small, which saves calculation time and memory.

In this paper, in order to extend the ability of GNP-RL and apply it to real-world problems, we aim to deal with numerical information (e.g., 15 degrees, 512 points), whereas all previously proposed GNP algorithms deal with discrete information (e.g., right, left). In the simulations, the proposed method is applied to the controller of the Khepera simulator, which learns wall-following behavior. The wall-following problem requires the robot to move along walls, and also to move as fast and as straight as possible. The rewards and fitness values used for learning and evolution, respectively, are calculated based on these conditions. Table 1 shows the generalization ability of GNP-RL, evolution-based GNP (standard GNP), and neural networks evolved by GA (NN-GA).
The average fitness is calculated as follows. First, each method produces programs in a training environment. Then, the average fitness is calculated in a testing environment using the best programs each method obtained in the training environment. Table 1 clarifies that GNP-RL shows the best fitness, with significant differences between GNP-RL and the other methods. Fig. 1 shows a typical track of the robot controlled by GNP-RL in the testing environment, together with the speed of the wheels in several situations. The figure shows that the proposed method learns wall-following behavior well: the robot moves straight along the wall and, when it comes close to the wall, avoids collision by appropriately adjusting the speed of the two wheels.

Paper: Genetic Network Programming with Reinforcement Learning and Its Application to Making Mobile Robot Behavior
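The reward conditions described above (stay along the wall, move as fast and as straight as possible) can be sketched as a simple shaping function. The abstract does not give the actual formula, so the function name, thresholds, and sensor ranges below are assumptions for illustration only:

```python
# Hypothetical sketch of a wall-following reward combining the two conditions
# mentioned in the text; all names and thresholds are assumptions.

def wall_following_reward(wall_distance, v_left, v_right,
                          d_min=0.05, d_max=0.15, v_max=1.0):
    """Return a reward in [0, 1] for one control step."""
    # Condition 1: the robot keeps a suitable distance from the wall.
    near_wall = 1.0 if d_min <= wall_distance <= d_max else 0.0
    # Condition 2: move fast and straight -> high mean wheel speed and
    # a small difference between the two wheel speeds.
    speed = (v_left + v_right) / (2.0 * v_max)
    straightness = 1.0 - abs(v_left - v_right) / v_max
    return near_wall * max(0.0, speed) * max(0.0, straightness)

# Near the wall, moving straight at 90% speed -> high reward.
r1 = wall_following_reward(0.10, 0.9, 0.9)
# Too far from the wall -> zero reward regardless of speed.
r2 = wall_following_reward(0.30, 0.9, 0.9)
```

A multiplicative combination like this gives zero reward whenever the robot leaves the wall band, which matches the behavior reported above: the robot is pushed both to stay near the wall and to keep the two wheel speeds high and equal.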
A novel optimization method named RasID-GA (Adaptive Random Search with Intensification and Diversification combined with Genetic Algorithm) is proposed in order to enhance the search ability of conventional RasID, a random search method with intensification and diversification. In this paper, the timing of switching from RasID to GA, and from GA to RasID, is also studied. RasID-GA is compared with parallel RasIDs and GA on 23 different objective functions, and it turns out that RasID-GA performs well compared with the other methods.
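As a rough illustration of the alternation between intensified local search (RasID) and diversified global search (GA), the sketch below minimizes a sphere function. The operators, parameters, and the fixed switching schedule are all assumptions; the paper's point is precisely that the switching timing matters, so treat this only as a structural outline:

```python
import random

def sphere(x):
    """A simple benchmark objective of the kind used in such comparisons."""
    return sum(xi * xi for xi in x)

def rasid_phase(x, f, iters=50, step=0.1):
    """Intensification: random search around one point (assumed simplified form)."""
    best, best_f = x, f(x)
    for _ in range(iters):
        cand = [xi + random.uniform(-step, step) for xi in best]
        cand_f = f(cand)
        if cand_f < best_f:
            best, best_f = cand, cand_f
    return best

def ga_phase(pop, f, gens=20):
    """Diversification: truncation selection, averaging crossover, Gaussian mutation."""
    for _ in range(gens):
        pop.sort(key=f)
        parents = pop[: len(pop) // 2]
        children = [
            [(ai + bi) / 2.0 + random.gauss(0.0, 0.05)
             for ai, bi in zip(*random.sample(parents, 2))]
            for _ in range(len(pop) - len(parents))
        ]
        pop = parents + children
    pop.sort(key=f)
    return pop

def rasid_ga(dim=5, pop_size=10, cycles=5):
    """Alternate GA (global) and RasID (local) phases on a fixed schedule."""
    pop = [[random.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(cycles):
        pop = ga_phase(pop, sphere)           # diversified search over the population
        pop[0] = rasid_phase(pop[0], sphere)  # intensify around the current best
    return sphere(min(pop, key=sphere))
```

With this fixed schedule the best fitness typically drops by orders of magnitude within a few cycles; RasID-GA's contribution lies in choosing when to switch between the two phases, which this sketch deliberately leaves fixed.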
A new graph-based evolutionary algorithm named "Genetic Network Programming (GNP)" has already been proposed. GNP represents its solutions as graph structures, which can improve the expression ability and performance. In addition, GNP with Reinforcement Learning (GNP-RL) was proposed a few years ago. Since GNP-RL can do reinforcement learning during task execution in addition to evolution after task execution, it can search for solutions efficiently. In this paper, GNP with Actor-Critic (GNP-AC), which is a new type of GNP-RL, is proposed. Originally, GNP deals with discrete information, but GNP-AC aims to deal with continuous information. The proposed method is applied to the controller of the Khepera simulator and its performance is evaluated. The evolution of the proposed method determines the structure of GNP, i.e., the connections between nodes, and Actor-Critic efficiently determines the node functions, i.e., the speed of the wheels and the parameters of the judgment nodes. This paper is organized as follows. In the next section, the algorithm of the proposed method is described. Section III shows the results of the simulations. Section IV is devoted to conclusions.
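The division of labor described above — evolution fixes the graph structure while Actor-Critic tunes continuous node parameters such as wheel speed — rests on a standard actor-critic update. The sketch below shows that update for a single continuous action with a Gaussian policy; the class name, state encoding, and learning rates are assumptions, not the paper's implementation:

```python
import random

class GaussianActorCritic:
    """TD actor-critic with one continuous action per discrete state (illustrative)."""

    def __init__(self, n_states, alpha=0.1, beta=0.01, gamma=0.9, sigma=0.2):
        self.V = [0.0] * n_states    # critic: estimated state values
        self.mu = [0.0] * n_states   # actor: mean action (e.g., a wheel speed)
        self.alpha, self.beta, self.gamma, self.sigma = alpha, beta, gamma, sigma

    def act(self, s):
        # Sample a continuous action from the Gaussian policy N(mu[s], sigma^2).
        return random.gauss(self.mu[s], self.sigma)

    def update(self, s, a, r, s_next):
        # The TD error drives both the critic and the actor.
        td = r + self.gamma * self.V[s_next] - self.V[s]
        self.V[s] += self.alpha * td
        # Policy-gradient step: move mu[s] toward actions with positive TD error.
        self.mu[s] += self.beta * td * (a - self.mu[s]) / self.sigma ** 2

# Toy usage: a single state in which the "best wheel speed" is 1.0.
random.seed(0)  # deterministic demo run
ac = GaussianActorCritic(n_states=1)
for _ in range(3000):
    a = ac.act(0)
    ac.update(0, a, -(a - 1.0) ** 2, 0)   # reward peaks at a = 1.0
```

After training, `ac.mu[0]` settles near the reward-maximizing speed of 1.0, which is the mechanism by which a continuous wheel speed can be learned during task execution while evolution handles the discrete graph connections.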