We consider a class of generalized planning problems based on the idea of quantifying over sets of similar objects. We show how to adapt fully observable nondeterministic (FOND) planning techniques to produce generalized solutions that are easy to instantiate over particular problem instances. We also describe how to reformulate a classical planning problem into a quantified one. The reformulation allows us to solve the original planning task without grounding every action with respect to all objects in the problem, and a single solution can be applied to a possibly infinite set of related classical planning tasks. We report experimental results showing that our approach is a practical and promising technique for solving an interesting class of problems.
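As a rough illustration of the quantification idea (a toy sketch in Python, not the paper's formalism; the state encoding and the `unstack_all_clear` step are our own illustrative choices), a single quantified step applies to every object satisfying its condition, so one generalized solution instantiates on instances of any size without per-object grounding:

```python
# Hypothetical sketch: one quantified step ("unstack every clear block")
# applies to all objects meeting its condition at once, so the same
# generalized solution covers instances of any size.
def unstack_all_clear(state):
    """state maps each block to the block beneath it (None = on the table)."""
    covered = {below for below in state.values() if below is not None}
    clear = set(state) - covered
    return {b: (None if b in clear else below) for b, below in state.items()}

# The same quantified step works for 3 blocks or 300:
print(unstack_all_clear({"b1": "b2", "b2": None, "b3": None}))
# {'b1': None, 'b2': None, 'b3': None}
```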
The real-world application of planning techniques often requires models with numeric fluents. However, these fluents are not directly supported by most planners and heuristics. We describe a family of planning algorithms that takes a numeric planning problem and produces an abstracted representation that can be solved using any classical planner. The resulting abstract plan is generalized into a policy and then used to guide the search in the original numeric domain. We prove that our approach is sound, and evaluate it on a set of standard benchmarks. Experiments demonstrate competitive performance when compared to other well-known algorithms for numeric planning, and a significant performance improvement in certain domains.
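A toy rendering of the abstract-then-guide loop described above (all names and the sign abstraction are our illustrative assumptions, not the paper's algorithm): a numeric fluent is collapsed into qualitative categories, the abstract plan is generalized into a state-to-action policy, and that policy drives search in the concrete numeric domain:

```python
# Hypothetical sketch: a sign abstraction of a numeric fluent, plus a policy
# distilled from an abstract plan, used to guide concrete numeric search.
def sign_abstraction(x, goal):
    return "at_goal" if x == goal else ("below" if x < goal else "above")

# Policy generalized from an abstract plan: abstract state -> action name.
policy = {"below": "increase", "above": "decrease"}

def solve_numeric(x, goal, actions, max_steps=1000):
    """Follow the policy in the concrete domain; `actions` maps action
    names to numeric effects, e.g. {"increase": +3, "decrease": -2}."""
    plan = []
    for _ in range(max_steps):
        abs_state = sign_abstraction(x, goal)
        if abs_state == "at_goal":
            return plan
        act = policy[abs_state]
        x += actions[act]
        plan.append(act)
    return None  # the policy did not reach the goal within the bound

print(solve_numeric(0, 9, {"increase": +3, "decrease": -2}))
# ['increase', 'increase', 'increase']
```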
Reinforcement learning (RL) agents seek to maximize the cumulative reward obtained when interacting with their environment. Users define tasks or goals for RL agents by designing specialized reward functions such that their maximization aligns with task satisfaction. This work explores the use of high-level symbolic action models as a framework for defining final-state goal tasks and automatically producing their corresponding reward functions. We also show how automated planning can be used to synthesize high-level plans that can guide hierarchical RL (HRL) techniques towards efficiently learning adequate policies. We provide a formal characterization of taskable RL environments and describe sufficient conditions that guarantee we can satisfy various notions of optimality (e.g., minimize total cost, maximize probability of reaching the goal). In addition, we present an empirical evaluation showing that our approach converges to near-optimal solutions faster than standard RL and HRL methods and that it provides an effective framework for transferring learned skills across multiple tasks in a given environment.
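A minimal sketch of the reward-generation idea, under our own assumptions about the encoding (goals and states as sets of propositional facts; the step cost is an illustrative shaping choice, not from the paper): the agent receives a terminal reward exactly when every fact of the symbolic final-state goal holds:

```python
# Hypothetical sketch: derive a reward function from a symbolic
# final-state goal expressed as a set of propositions.
def make_goal_reward(goal_facts, step_cost=-0.01):
    """goal_facts: e.g. {("holding", "key"), ("at", "door")}."""
    goal = frozenset(goal_facts)
    def reward(state_facts):
        done = goal <= state_facts  # all goal facts are satisfied
        return (1.0 if done else step_cost), done
    return reward

reward_fn = make_goal_reward({("at", "door"), ("holding", "key")})
r, done = reward_fn(frozenset({("at", "door")}))  # (-0.01, False): goal not yet met
```

In the HRL setting the paper describes, one could analogously attach such a reward to each subgoal of a synthesized high-level plan, so that each option or skill is trained against its own planning-derived reward.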
Many applications, ranging from video games to dynamic robotics, require solving single-agent, deterministic search problems in partially known environments under very tight time constraints. Real-Time Heuristic Search (RTHS) algorithms are specifically designed for such applications. As a subroutine, most of them invoke a standard, but bounded, search algorithm that searches for the goal. In this paper we present FRIT, a simple approach for single-agent deterministic search problems under tight time constraints and partially known environments that, unlike traditional RTHS, does not search for the goal but rather for a path that connects the current state with a so-called ideal tree T. When the agent observes that an arc in the tree cannot be traversed in the actual environment, it removes that arc from T and then carries out a reconnection search whose objective is to find a path between the current state and any node in T. The reconnection search is performed by an algorithm that is passed as a parameter to FRIT; if that parameter is an RTHS algorithm, the resulting algorithm is itself an RTHS algorithm. We show, in addition, that FRIT may be fed a (bounded) complete blind-search algorithm. We evaluate our approach on grid pathfinding benchmarks, including game maps and mazes. Our results show that FRIT, used with the standard RTHS algorithm RTAA*, outperforms RTAA* significantly, by one order of magnitude under tight time constraints. In addition, FRIT(daRTAA*) substantially outperforms daRTAA*, a state-of-the-art RTHS algorithm, usually obtaining solutions 50% cheaper on average for the same search effort. FRIT(BFS), i.e., FRIT using breadth-first search, obtains better-quality solutions than Adaptive A* and Repeated A* when time is limited. Finally, we show that Bug2, a pathfinding-specific navigation algorithm, outperforms FRIT(BFS) when planning time is extremely limited, but the situation reverses when more time is available.
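A simplified sketch of the reconnection loop, assuming the ideal tree T is stored as parent pointers toward the goal and using plain breadth-first search as the reconnection subroutine (the FRIT(BFS) variant); the graph interface (`neighbors`) and the rewiring details are our illustrative choices, not the paper's pseudocode:

```python
# Hypothetical sketch of FRIT(BFS): when an arc of T is found blocked, the
# caller deletes the corresponding parent pointer and runs this reconnection
# search from the current state to any node still connected to the goal in T.
from collections import deque

def in_tree(node, parent, goal):
    """A node is 'in T' if its parent chain still reaches the goal."""
    while node is not None:
        if node == goal:
            return True
        node = parent.get(node)
    return False

def reconnect_bfs(start, parent, goal, neighbors):
    """BFS from the current state to any node still in T; splice the
    discovered path into T by adding new parent pointers."""
    prev, frontier, seen = {}, deque([start]), {start}
    while frontier:
        u = frontier.popleft()
        if u != start and in_tree(u, parent, goal):
            while u in prev:          # rewire the path start -> u into T
                parent[prev[u]] = u
                u = prev[u]
            return True
        for v in neighbors(u):
            if v not in seen:
                seen.add(v)
                prev[v] = u
                frontier.append(v)
    return False  # no reconnection exists in the known graph
```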