2021
DOI: 10.1609/aaai.v35i13.17402

Learning General Planning Policies from Small Examples Without Supervision

Abstract: Generalized planning is concerned with the computation of general policies that solve multiple instances of a planning domain all at once. It has been recently shown that these policies can be computed in two steps: first, a suitable abstraction in the form of a qualitative numerical planning problem (QNP) is learned from sample plans; then, the general policies are obtained from the learned QNP using a planner. In this work, we introduce an alternative approach for computing more expressive general policies wh…

Cited by 17 publications (28 citation statements) · References 16 publications
“…The second says that when the gripper is not empty, any action that makes H false and does not affect n should be selected. It has been shown that general policies of this form can be learned without supervision by solving a Max-Weighted SAT theory T(S, F), where S is a set of sampled state transitions and F is a large but finite pool of Boolean and numerical features obtained from the domain predicates (Francès, Bonet, and Geffner 2021).…”
Section: General Policies and Value Functions (mentioning)
confidence: 99%
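To make the quoted learning setup concrete, here is a minimal sketch of the feature-selection core behind such a Max-Weighted SAT theory T(S, F). The feature names, costs, and transitions below are hypothetical, and a brute-force search stands in for an actual MaxSAT solver; the real theory of Francès, Bonet, and Geffner (2021) additionally encodes the policy rules over the selected features.

```python
from itertools import combinations

# Toy illustration (hypothetical data): find a cheapest feature subset that
# distinguishes every good sampled transition from every bad one.  Features
# loosely follow the Gripper example from the quote: H = "holding a ball",
# n = "balls still to be moved", plus a distractor feature "noise".
# Changes per transition: -1 = decreases, 0 = unchanged, +1 = increases.
FEATURES = {"H": 2, "n": 3, "noise": 4}            # feature -> complexity cost
TRANSITIONS = [                                     # (feature changes, is_good)
    ({"H": -1, "n": 0, "noise": +1}, True),         # drop the ball at the target
    ({"H": +1, "n": -1, "noise": 0}, True),         # pick up a ball to be moved
    ({"H": -1, "n": +1, "noise": +1}, False),       # drop it back where it was
    ({"H": 0, "n": 0, "noise": +1}, False),         # wander between rooms
]

def distinguishes(feats, t_good, t_bad):
    """True if some selected feature changes differently in the two transitions."""
    return any(t_good[0][f] != t_bad[0][f] for f in feats)

def learn_features():
    """Cheapest feature subset separating good from bad sampled transitions."""
    best = None
    for r in range(1, len(FEATURES) + 1):
        for feats in combinations(sorted(FEATURES), r):
            cost = sum(FEATURES[f] for f in feats)
            if (best is None or cost < best[0]) and all(
                distinguishes(feats, tg, tb)
                for tg in TRANSITIONS if tg[1]
                for tb in TRANSITIONS if not tb[1]
            ):
                best = (cost, feats)
    return best

print(learn_features())   # -> (5, ('H', 'n')) under these toy weights
```

The weights play the role of feature complexity, so the cheapest satisfying selection is the simplest feature set that still separates the good transitions from the bad ones, which is exactly the optimization a Max-Weighted SAT solver performs over the full theory.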
“…General Policies. The problem of learning general policies has been addressed using combinatorial approaches where the symbolic domains are given (Khardon 1999; Martín and Geffner 2004; Bonet, Francès, and Geffner 2019; Francès, Bonet, and Geffner 2021), DL approaches where the domains are also given (Toyer et al. 2020; Garg, Bajpai, and Mausam 2020), and DRL approaches that make no use of prior knowledge about the structure of either domains or states (Groshev et al. 2018; Chevalier-Boisvert et al. 2019; Campero et al. 2021). This work is a step toward bringing the first two approaches together, along with their potential benefits.…”
Section: Related Work (mentioning)
confidence: 99%
“…We turn to the problem of learning sketches given a set of instances P of the target class of problems Q and the desired bound k on sketch width. We roughly follow the approach for learning general policies (Bonet, Francès, and Geffner 2019; Francès, Bonet, and Geffner 2021) by constructing a theory T_{k,m}(P, F) from P, k, a bound m on the number of sketch rules, and a finite pool of features F obtained from the domain predicates and a fixed grammar.…”
Section: Learning Sketches: Formulation (mentioning)
confidence: 99%
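To see why the bounds matter, here is a back-of-the-envelope count of the candidate-sketch space that the pool F and the rule bound m delimit; the value counts per feature are invented for illustration, not the authors' grammar.

```python
# Hypothetical counting: each sketch rule C -> E fixes, per feature, one of a
# few condition values (e.g. true / false / don't-care) and one of a few
# effect values (e.g. set / unset / unchanged / don't-care).

def rule_space(num_features, cond_vals=3, eff_vals=4):
    """Number of distinct condition/effect rules over a pool of features."""
    return (cond_vals ** num_features) * (eff_vals ** num_features)

def sketch_space(num_features, m):
    """Loose upper bound on sketches built from at most m rules."""
    r = rule_space(num_features)
    return sum(r ** i for i in range(1, m + 1))

for f, m in [(2, 1), (2, 2), (5, 3)]:
    print(f"|F| = {f}, m = {m}: <= {sketch_space(f, m):,} candidate sketches")
```

The width bound k does not shrink this raw space; it enters T_{k,m}(P, F) as constraints on which candidate sketches are acceptable, which is why the selection is delegated to a solver rather than done by explicit enumeration.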
“…The language of sketches is powerful, as sketches can encode everything from simple goal serializations to full general policies. Indeed, the language of general policies is the language of sketches but with a slightly different semantics, where the subgoal states s' to be reached from a state s are restricted to be one step away from s (Bonet and Geffner 2018; Francès, Bonet, and Geffner 2021). More interestingly, sketches can split problems into subproblems of bounded width (Lipovetzky and Geffner 2012; Lipovetzky 2021), which can then be solved greedily, in polynomial time, by a variant of the SIW algorithm called SIW_R (Bonet and Geffner 2021).…”
Section: Introduction (mentioning)
confidence: 99%
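The following is a minimal illustrative sketch of the SIW_R control loop described in the quote, not the authors' implementation. It assumes states are frozensets of ground atoms, hard-codes the width bound at 1, and leaves the successor function succ, the goal test, and the sketch rules (hypothetical objects with condition and effect callables) to the problem at hand.

```python
from collections import deque

def iw1(state, succ, is_subgoal):
    """IW(1): breadth-first search that prunes states adding no novel atom."""
    seen_atoms = set(state)
    parent = {state: None}
    frontier = deque([state])
    while frontier:
        s = frontier.popleft()
        if is_subgoal(s):
            path = []                       # reconstruct the path to s
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for t in succ(s):
            # Keep t only if it is new and contains some unseen atom.
            if t not in parent and any(a not in seen_atoms for a in t):
                seen_atoms |= t
                parent[t] = s
                frontier.append(t)
    return None

def siw_r(state, succ, goal, rules):
    """Greedy serialization: chain IW(1) searches guided by sketch rules."""
    trace = [state]
    while not goal(state):
        # Pick the first applicable sketch rule (assumes one always applies).
        rule = next(r for r in rules if r.condition(state))
        segment = iw1(state, succ,
                      lambda s: s != state and rule.effect(state, s))
        if segment is None:
            raise RuntimeError("subproblem not solvable within width 1")
        trace.extend(segment[1:])
        state = trace[-1]
    return trace
```

Evaluating rule.effect relative to the state where the rule fired mirrors the sketch semantics, in which subgoals are defined by feature changes with respect to the initiating state; the general-policy case mentioned in the quote corresponds to limiting each inner search to a single step.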