Reinforcement Programming (RP) is a new technique for automatically generating a computer program using reinforcement learning methods. This paper describes how RP learned to generate code for three binary addition problems: simulating a full adder circuit, incrementing a binary number, and adding two binary numbers. Each problem is presented as an extension of the previous one, which provides an introduction to the practical application of RP. Each solution uses a dynamic, episodic form of the delayed Q-Learning algorithm. "Dynamic" means that the algorithm grows the policy during learning and prunes it before the policy is translated to source code. This differs from Q-Learning models that use fixed-size tables or neural network function approximators to store the q-values associated with (state, action) pairs. The states, actions, rewards, other parameters, and experimental results are presented for each of the three problems.
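The idea of a policy that grows during learning and is pruned afterwards can be illustrated with a minimal sketch. It is not the paper's implementation: it uses plain one-step Q-learning rather than delayed Q-Learning, and the toy chain environment, the `DynamicQTable` class, the `step` and `train` helpers, and the pruning rule (keep only entries on the greedy path from the start state) are all assumptions made for illustration.

```python
import random

GOAL = 3                      # hypothetical terminal state of a toy chain 0 -> 1 -> 2 -> 3
ACTIONS = ["inc", "stay"]     # hypothetical actions, not the paper's action set


def step(state, action):
    """Toy transition: 'inc' moves one state forward, 'stay' does nothing."""
    return state + 1 if action == "inc" else state


class DynamicQTable:
    """Q-values kept in a dict that gains an entry only when a pair is updated."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha
        self.gamma = gamma
        self.q = {}  # (state, action) -> q-value; starts empty

    def value(self, state, action):
        # Unseen pairs default to 0.0 without being stored.
        return self.q.get((state, action), 0.0)

    def best_action(self, state):
        # Ties are broken in favour of the first action in the list.
        return max(self.actions, key=lambda a: self.value(state, a))

    def update(self, state, action, reward, next_state, done):
        target = reward
        if not done:
            target += self.gamma * max(self.value(next_state, a) for a in self.actions)
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (target - old)

    def prune(self, start_state, max_steps=100):
        """Assumed pruning rule: keep only entries on the greedy path from start_state."""
        kept = {}
        state = start_state
        for _ in range(max_steps):
            if state == GOAL:
                break
            action = self.best_action(state)
            kept[(state, action)] = self.value(state, action)
            state = step(state, action)
        self.q = kept
        return kept


def train(table, episodes=200, epsilon=0.3, seed=0):
    """Episodic epsilon-greedy training; reward 1.0 only on reaching GOAL."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state = 0
        for _ in range(20):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(table.actions)
            else:
                action = table.best_action(state)
            next_state = step(state, action)
            done = next_state == GOAL
            table.update(state, action, 1.0 if done else 0.0, next_state, done)
            state = next_state
            if done:
                break


table = DynamicQTable(ACTIONS)
train(table)
size_before = len(table.q)   # entries accumulated during learning
pruned = table.prune(start_state=0)
print(size_before, sorted(pruned))
```

After pruning, the remaining (state, action) pairs form a compact policy that could be read off as a sequence of program steps, which is the role pruning plays before translation to source code in the description above.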
Reinforcement Programming (RP) is a new approach to automatically generating algorithms that uses reinforcement learning techniques. This paper introduces the RP approach and demonstrates its use to generate a generalized, in-place, iterative sort algorithm. The RP approach improves on earlier results that use genetic programming (GP). The resulting algorithm is novel and more efficient than comparable sorting routines. RP learns the sort in fewer iterations than GP and with fewer resources. Experiments establish interesting empirical bounds on learning the sort algorithm: a list of size 4 is sufficient to learn the generalized sort algorithm, the training set requires only one element, and learning took fewer than 200,000 iterations. Additionally, RP was used to generate three binary addition algorithms: a full adder, a binary incrementer, and a binary adder.
Currently, each lock and dam system within the United States uses different components and needs different parts to complete routine maintenance checks and procedures. Having unique components and parts for each lock and dam system drastically increases the costs required for the United States Army Corps of Engineers (USACE) to operate and maintain these locks and dams. One way to reduce these costs is to work towards, and recommend, standardized components for a lock and dam system. Standardization is vital, especially for construction projects, because it simplifies the build and production stages of a project as well as life cycle maintenance. Understanding hydraulic design for the inflow and outflow of a lock system was an important consideration for this design project. The overall goal of the design is to reduce hawser forces while maximizing the efficiency of the filling and emptying process. To minimize hawser forces, it is essential to mitigate the effects of hydrodynamic and hydrostatic forces. This research also strives to gain additional understanding of the dynamic, turbulent nature of water in a lock and dam system. In the Emsworth Lock and Dam system, the top of rock for the riverbed is significantly higher than normal, which presents unique challenges for modeling and simulation as well as for physical model construction. Critical to the design of a physical model is the determination of an adequate scaling factor that will not significantly affect the natural hydraulic processes within the system. As such, it is essential that appropriate theories are applied to remain consistent with proven methods of hydraulic scaling. Before selecting a scaling ratio, it was necessary to determine space limitations and develop a conceptual design of the model. This assisted in visualizing the model in the available spaces to ensure that the design and manufacturing plan was realistic.
The model contains three components: a main lock chamber, a higher elevation water reservoir, and a lower elevation water reservoir. The component that most constrains the design is the main lock chamber; it cannot be altered in any way to meet the floor space requirements, because any modification would affect the results of the hawser force testing and the model would no longer match reality. The physical model will be verified using the Froude equation, which governs the performance of models that depend on gravity. As such, during any inflow or outflow of water in the system, it is essential that the velocity is controlled so that the Froude number is consistent with that of the actual Emsworth Lock and Dam. The model must match a Froude number of 0.052 to effectively represent reality.
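The Froude matching described above can be sketched numerically. The Froude number is commonly written as Fr = v / sqrt(g L), so holding Fr constant between prototype and model implies that velocities scale with the square root of the length scale. In the sketch below only the target value 0.052 comes from the text; the prototype depth, the 1:25 length scale, and the function names are hypothetical values chosen for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def froude_number(velocity, depth):
    """Fr = v / sqrt(g * L), with depth used as the characteristic length L."""
    return velocity / math.sqrt(G * depth)


def model_velocity(prototype_velocity, length_scale):
    """Froude similitude: velocity scales with the square root of the length scale."""
    return prototype_velocity * math.sqrt(length_scale)


TARGET_FROUDE = 0.052    # value stated in the text
PROTOTYPE_DEPTH = 5.0    # hypothetical prototype depth in metres
LENGTH_SCALE = 1 / 25    # hypothetical model-to-prototype length ratio

# Prototype velocity that yields the target Froude number at the assumed depth.
prototype_velocity = TARGET_FROUDE * math.sqrt(G * PROTOTYPE_DEPTH)

# Scale depth and velocity down to the model, then confirm Fr is preserved.
scaled_depth = PROTOTYPE_DEPTH * LENGTH_SCALE
scaled_velocity = model_velocity(prototype_velocity, LENGTH_SCALE)
model_froude = froude_number(scaled_velocity, scaled_depth)

print(f"model velocity: {scaled_velocity:.4f} m/s, model Froude: {model_froude:.3f}")
```

Under these assumptions, the model reproduces the prototype Froude number as long as the flow velocity is reduced by the square root of the geometric scale, which is why the inflow and outflow velocities have to be controlled in the physical model.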
Reinforcement Programming (RP) is a new approach to automatically generating algorithms that uses reinforcement learning techniques. This paper describes the RP approach and gives the results of experiments using RP to generate a generalized, in-place, iterative sort algorithm. The RP approach improves on earlier results that use genetic programming (GP). The resulting algorithm is novel and more efficient than comparable sorting routines. RP learns the sort in fewer iterations than GP and with fewer resources. Results establish interesting empirical bounds on learning the sort algorithm: a list of size 4 is sufficient to learn the generalized sort algorithm, the training set requires only one element, and learning took fewer than 200,000 iterations. RP has also been used to generate three binary addition algorithms: a full adder, a binary incrementer, and a binary adder.