2022
DOI: 10.1609/aaai.v36i9.21215

Learning Probably Approximately Complete and Safe Action Models for Stochastic Worlds

Abstract: We consider the problem of learning action models for planning in unknown stochastic environments that can be defined using the Probabilistic Planning Domain Description Language (PPDDL). As input, we are given a set of previously executed trajectories, and the main challenge is to learn an action model that has a similar goal achievement probability to the policies used to create these trajectories. To this end, we introduce a variant of PPDDL in which there is uncertainty about the transition probabilities, …
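The "uncertainty about the transition probabilities" mentioned in the abstract can be made concrete with a standard Hoeffding-style confidence interval over an outcome's empirical frequency in the given trajectories. This is an illustrative sketch only; the function name and the specific bound are assumptions, not necessarily what the paper uses:

```python
import math

def outcome_prob_interval(count, n, delta=0.05):
    """Two-sided Hoeffding interval for the probability of one stochastic
    action outcome, estimated from n observed executions of the action.

    count: how many times this outcome occurred; delta: allowed failure
    probability of the bound. Illustrative only -- the paper's actual
    bound may differ.
    """
    p_hat = count / n
    eps = math.sqrt(math.log(2 / delta) / (2 * n))
    # Clip to [0, 1] since the quantity is a probability.
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)
```

For example, observing an outcome 70 times in 100 executions yields an interval centered at 0.7 whose width shrinks as more trajectories are collected.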


Cited by 4 publications (7 citation statements)
References 21 publications
“…FAMA works even if the observations given to it are partially observable. Algorithms from the SAM learning family (Stern and Juba 2017; Juba, Le, and Stern 2021; Juba and Stern 2022; Mordoch et al. 2022) are different from other action model learning algorithms in that they guarantee that the action model they return is safe, in the sense that plans consistent with it are also consistent with the real, unknown action model. Most algorithms from this family have a tractable running time and reasonable sample complexity to ensure a probabilistic form of completeness, but rely on perfect observability of the given observations.…”
Section: Background and Problem Definition
confidence: 99%
“…Plans generated with the learned model may not be executable or may fail to achieve their intended goals. SAM Learning (Stern and Juba 2017; Juba, Le, and Stern 2021; Juba and Stern 2022; Mordoch et al. 2022) is a recently introduced family of learning algorithms that provides safety guarantees over the learned PDDL model: any plan generated with the model they return is guaranteed to be executable and to achieve the intended goals. SAM Learning, however, is limited to learning from fully observed trajectories.…”
Section: Introduction
confidence: 99%
“…The algorithms presented above learn action models that do not guarantee that the learned actions are applicable according to the agent's actual action model definition. In contrast to these algorithms, the SAM family of algorithms is designed to learn action models in a setting where execution failures must be avoided (Stern and Juba 2017; Juba, Le, and Stern 2021; Juba and Stern 2022). To this end, SAM generates a conservative action model.…”
Section: Related Work
confidence: 99%
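The conservative rule that SAM-family learners apply can be illustrated with a toy sketch for the deterministic, fully observed case: a literal is kept as a precondition only if it held in every observed pre-state of the action, and effects are the literals seen to change. The function name and flat-set state representation here are assumptions for illustration, not the authors' implementation:

```python
def learn_safe_model(triples):
    """Conservative STRIPS-style learning rule (a sketch).

    triples: iterable of (pre_state, action, post_state), where states are
    sets of ground facts. Assumes deterministic actions and full
    observability, as in the classical SAM setting.
    """
    pre = {}   # action -> candidate preconditions (intersection of pre-states)
    add = {}   # action -> add effects (facts that appeared)
    dele = {}  # action -> delete effects (facts that disappeared)
    for s, a, s2 in triples:
        # Keep only facts observed in EVERY pre-state: a superset of the
        # true preconditions, which is what makes the model "safe".
        pre[a] = set(s) if a not in pre else pre[a] & s
        add.setdefault(a, set()).update(s2 - s)
        dele.setdefault(a, set()).update(s - s2)
    return pre, add, dele
```

Because the learned preconditions can only be stronger than the true ones, any plan the learned model deems applicable is also applicable under the real, unknown model.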
“…We focus on such cases and aim to learn an action model that satisfies the strongest form of soundness: every plan generated using the learned model must be applicable and yield the same states as an unknown, accurate model. An action model that satisfies this requirement has been called safe (Juba, Le, and Stern 2021; Juba and Stern 2022; Mordoch, Stern, and Juba 2023). We view this as a "safety" notion in part because it enables more conventional notions of safety to be enforced during planning, and provides assurance that they will carry over to actual execution.…”
Section: Introduction
confidence: 99%