The mean percentage difference between the empirical MSE and the posterior variance. The mean was calculated over a set of grid locations inside the farm.
4.3 Disk placement covering the farm, computed by the DC algorithm. The disks have radius 3r_max and are concentric with the disks of radius r_max in I. The approximate optimal TSP tour visiting the disk centers is shown in blue. The lawnmower detours have been omitted to make the figure more legible.
This paper studies the problem of Reinforcement Learning (RL) using as few real-world samples as possible. A naive application of RL algorithms can be inefficient in large and continuous state spaces. We present two versions of the Multi-Fidelity Reinforcement Learning (MFRL) algorithm that leverage Gaussian Processes (GPs) to learn the optimal policy in a real-world environment. In the MFRL framework, an agent uses multiple simulators of the real environment to perform actions. As fidelity increases along the simulator chain, the number of samples needed in each successively higher-fidelity simulator can be reduced. Incorporating GPs into the MFRL framework further reduces the number of learning samples required as we move up the simulator chain. We evaluate the performance of our algorithms in real-world navigation experiments with a ground robot.
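The sample-saving idea in the abstract can be illustrated with a toy regression sketch. This is not the MFRL algorithm from the paper: it only shows one common way a GP can carry information up a fidelity chain, by using the posterior mean of a GP fitted to many cheap simulator samples as the prior mean of a GP fitted to a few real-world samples. The functions `simulator` and `real_world`, the kernel length scale, the noise level, and the sample locations are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf(a, b, length_scale):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior_mean(x_train, y_train, x_query, prior_mean,
                      length_scale=0.5, noise=1e-4):
    """GP posterior mean with an arbitrary prior mean function."""
    K = rbf(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    residual = y_train - prior_mean(x_train)
    alpha = np.linalg.solve(K, residual)
    return prior_mean(x_query) + rbf(x_query, x_train, length_scale) @ alpha


# Hypothetical environments: the simulator misses a constant offset.
def simulator(x):
    return np.sin(3.0 * x)


def real_world(x):
    return np.sin(3.0 * x) + 0.3


def zero_mean(x):
    return np.zeros_like(x)


x_sim = np.linspace(0.0, 5.0, 40)    # many cheap simulator samples
x_real = np.array([0.5, 2.5, 4.5])   # few expensive real-world samples
x_test = np.linspace(0.0, 5.0, 200)


# Low-fidelity GP: learned from the simulator only.
def sim_mean(x):
    return gp_posterior_mean(x_sim, simulator(x_sim), x, zero_mean)


# High-fidelity GP: the same real samples, with and without the simulator prior.
mf_pred = gp_posterior_mean(x_real, real_world(x_real), x_test, sim_mean)
direct_pred = gp_posterior_mean(x_real, real_world(x_real), x_test, zero_mean)


def rmse(pred):
    return float(np.sqrt(np.mean((pred - real_world(x_test)) ** 2)))


mf_rmse = rmse(mf_pred)
direct_rmse = rmse(direct_pred)
print(f"RMSE with simulator prior: {mf_rmse:.3f}")
print(f"RMSE without simulator prior: {direct_rmse:.3f}")
```

In this sketch the three real-world samples only have to correct the gap between the simulator and the real environment, so the prediction error is lower than when the same three samples are used alone. How large the saving is in an RL setting depends on how faithful the simulators are, which is the question the paper's experiments address.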