A Model-Free Approach to Meta-Level Control of Anytime Algorithms

Svegliato, Justin; Sharma, Prakhar; Zilberstein, Shlomo

doi:10.1109/icra40945.2020.9196898

Cited by 10 publications

(4 citation statements)

References 29 publications

“…However, it is often not feasible to calculate the cost of the optimal solution for complex problems. Like earlier work (Hansen and Zilberstein 2001; Svegliato, Wray, and Zilberstein 2018; Svegliato, Sharma, and Zilberstein 2020), we estimate solution quality as the ratio, q = h(s_0)/ζ, with h(s_0) as the h-value of the initial state s_0 and ζ as the cost of the final solution.…”

Section: Methods (mentioning)

confidence: 99%

On the Benefits of Randomly Adjusting Anytime Weighted A*

Bhatia

Svegliato

Zilberstein

2021

SOCS


Anytime Weighted A*, an anytime heuristic search algorithm that uses a weight to scale the heuristic value of each node in the open list, has proven to be an effective way to manage the trade-off between solution quality and computation time in heuristic search. Finding the best weight, however, is challenging because it depends not only on the characteristics of the domain and the details of the instance at hand, but also on the available computation time. We propose a randomized version of this algorithm, called Randomized Weighted A*, that randomly adjusts its weight at runtime and show a counterintuitive phenomenon: RWA* generally performs as well as or better than AWA* with the best static weight on a range of benchmark problems. The result is a simple algorithm that is easy to implement and performs consistently well without any offline experimentation or parameter tuning.
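The quality estimate quoted in the citation statement for this paper, q = h(s_0)/ζ, can be illustrated with a minimal Python sketch. The function name `estimate_quality` and the sample numbers are hypothetical and are not taken from the cited papers; the sketch only assumes that the heuristic value of the initial state and the cost of each solution found so far are available. If the heuristic is admissible, h(s_0) is a lower bound on the optimal cost, so the estimate cannot exceed 1.

```python
def estimate_quality(h_initial: float, solution_cost: float) -> float:
    """Approximate solution quality as q = h(s_0) / zeta.

    h_initial: heuristic value of the initial state, h(s_0).
    solution_cost: cost zeta of the solution being scored.
    """
    if solution_cost <= 0:
        raise ValueError("solution_cost must be positive")
    return h_initial / solution_cost


# Hypothetical run: an anytime search returns successively cheaper solutions.
h_s0 = 20.0
costs = [40.0, 25.0, 20.0]
qualities = [estimate_quality(h_s0, cost) for cost in costs]
print(qualities)
```

As the solution cost falls toward the heuristic estimate, the ratio rises, which gives a monotone quality signal without knowing the optimal cost.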

“…While fixed allocation is effective given negligible uncertainty in the performance of the anytime algorithm, there is often substantial uncertainty in real-time planning (Paul et al. 1991). Hence, a more sophisticated approach, namely monitoring and control, tracks the performance of the algorithm and estimates a stopping point at runtime periodically (Horvitz 1990; Zilberstein and Russell 1995; Hansen and Zilberstein 2001; Lin et al. 2015; Svegliato, Wray, and Zilberstein 2018; Svegliato, Sharma, and Zilberstein 2020). Our approach not only determines the stopping point but also tunes the hyperparameters of an anytime algorithm at runtime.…”

Section: Related Work (mentioning)

confidence: 99%

Tuning the Hyperparameters of Anytime Planning: A Metareasoning Approach with Deep Reinforcement Learning

Bhatia

Svegliato

Nashed

et al. 2022

ICAPS

Self Cite


Anytime planning algorithms often have hyperparameters that can be tuned at runtime to optimize their performance. While work on metareasoning has focused on when to interrupt an anytime planner and act on the current plan, the scope of metareasoning can be expanded to tuning the hyperparameters of the anytime planner at runtime. This paper introduces a general, decision-theoretic metareasoning approach that optimizes both the stopping point and hyperparameters of anytime planning. We begin by proposing a generalization of the standard meta-level control problem for anytime algorithms. We then offer a meta-level control technique that monitors and controls an anytime algorithm using deep reinforcement learning. Finally, we show that our approach boosts performance on a common benchmark domain that uses anytime weighted A* to solve a range of heuristic search problems and a mobile robot application that uses RRT* to solve motion planning problems.


“…The most related work to ours is by Hansen and Zilberstein [11], where they proposed a dynamic-programming approach to solve the model-based variant of meta-reasoning. They later developed online approaches, such as online performance prediction [10] and RL-based model-free metareasoning [12] in an attempt to remove the necessity of preprocessing and gathering data. They particularly considered the solution quality to be safety, yielding smooth performance profiles.…”

Section: Related Work (mentioning)

confidence: 99%

“…We also investigate model-free approaches that use the function-approximation capabilities of neural networks to mitigate the curse of dimensionality. As meta-reasoning is a control problem, approximate dynamic programming or reinforcement learning (RL) can be adopted to learn a policy [12]. However, we observe that in our problem, we have access to an oracle for the optimal decision policy for each performance profile in the dataset, simply by letting the motion planner run long enough so that we get diminishing returns and determining, post hoc, the optimal stopping time for each training example.…”

Section: Introduction (mentioning)

confidence: 99%

Learning When to Quit: Meta-Reasoning for Motion Planning

Sung

Kaelbling

Lozano-Pérez

2021

Preprint


Anytime motion planners are widely used in robotics. However, the relationship between their solution quality and computation time is not well understood, and thus it is unclear when to quit planning and start execution. In this paper, we address the problem of deciding when to stop deliberation under bounded computational capacity, so-called meta-reasoning, for anytime motion planning. We propose data-driven learning methods, model-based and model-free meta-reasoning, that are applicable to different environment distributions and agnostic to the choice of anytime motion planners. As a part of the framework, we design a convolutional neural network-based optimal solution predictor that predicts the optimal path length from a given 2D workspace image. We empirically evaluate the performance of the proposed methods in simulation in comparison with baselines.
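The citation statement from this paper's introduction describes determining, post hoc, the optimal stopping time for a recorded performance profile. A minimal Python sketch of that idea follows. It assumes a utility of the form q_t - c·t, where the linear time cost c is an illustrative choice and not the cited paper's formulation; the function name `best_stopping_index` and the sample profile are likewise hypothetical.

```python
from typing import Sequence


def best_stopping_index(qualities: Sequence[float], time_cost_per_step: float) -> int:
    """Return the step t that maximizes q_t - c * t over a recorded profile.

    qualities: solution quality observed at each step of a completed run.
    time_cost_per_step: assumed linear cost c charged per step of planning.
    """
    if not qualities:
        raise ValueError("profile must be non-empty")
    utilities = [q - time_cost_per_step * t for t, q in enumerate(qualities)]
    # Ties resolve to the earliest step, since max returns the first maximum.
    return max(range(len(utilities)), key=utilities.__getitem__)


# Hypothetical profile with diminishing returns in quality.
profile = [0.50, 0.70, 0.80, 0.84, 0.85]
stop = best_stopping_index(profile, time_cost_per_step=0.05)
print(stop)
```

Because the whole profile is known after the run, the label is a simple argmax; a learned stopping policy would have to predict it from the partial profile seen so far.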
