Learning action models with minimal observability

Aineto, Diego; Celorrio, Sergio Jiménez; Onaindía, Eva

doi:10.1016/j.artint.2019.05.003

Cited by 43 publications

(62 citation statements)

References 25 publications


“…We halted this process when the learned action model was equivalent to the real model, and report the number of triplets and trajectories given to the algorithm. As a baseline, we performed this experiment also with FAMA (Aineto, Celorrio, and Onaindia 2019), which is a modern algorithm for learning action models from trajectories. Note that unlike SAM Learning, FAMA has no safety guarantee.…”

Section: Methods (mentioning)

confidence: 99%

“…It constructs a graphical model and learns the statistical relationship between actions and possible state transitions. FAMA (Aineto, Celorrio, and Onaindia 2019) compiles the problem of finding an action model that is consistent with a set of trajectories to a planning problem. The solution to this planning problem is a sequence of "actions" that construct an action model.…”

Section: Related Work (mentioning)

confidence: 99%


Safe Learning of Lifted Action Models

Juba

Stern

2021

Proceedings of the Eighteenth International Conference on Principles of Knowledge Representation and Reasoning


Creating a domain model, even for classical, domain-independent planning, is a notoriously hard knowledge-engineering task. A natural approach to solve this problem is to learn a domain model from observations. However, model learning approaches frequently do not provide safety guarantees: the learned model may assume actions are applicable when they are not, and may incorrectly capture actions' effects. This may result in generating plans that will fail when executed. In some domains such failures are not acceptable, due to the cost of failure or inability to replan online after failure. In such settings, all learning must be done offline, based on some observations collected, e.g., by some other agents or a human. Through this learning, the task is to generate a plan that is guaranteed to be successful. This is called the model-free planning problem. Prior work proposed an algorithm for solving the model-free planning problem in classical planning. However, they were limited to learning grounded domains, and thus they could not scale. We generalize this prior work and propose the first safe model-free planning algorithm for lifted domains. We prove the correctness of our approach, and provide a statistical analysis showing that the number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model. We also present experiments on twelve IPC domains showing that our approach is able to learn the real action model in all cases with at most two trajectories.


“…• Obs(comp(s)) = Obs(s). The compatibility domain D induced by a domain D is the image of D under the compatibility mapping comp. It hence encodes the original transition systems as seen through the lens of the observation function.…”

Section: Compatibility Domain (mentioning)

confidence: 99%

“…We discuss those works that are most directly related to the results of this paper. In recent years, several works have appeared that can learn action descriptions in partially observable environments [51,3,56,55,57,41,58,39,19,38,1]. In these works, partial observability is induced by selecting at random n < |P| propositional symbols to observe, for each state in the learning input.…”

Section: Related Work (mentioning)

confidence: 99%
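The masking scheme described in the statement above (observe n < |P| randomly chosen propositional symbols in each state) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `observe_partially`, the blocks-world-style symbol names, and the choice to resample the mask independently for every state are all hypothetical and are not taken from any of the cited works.

```python
import random


def observe_partially(state, symbols, n, rng):
    """Return a partial observation of `state` over n randomly chosen symbols.

    state   : set of propositional symbols that are true in the state
    symbols : set of all propositional symbols P
    n       : number of symbols to observe; must satisfy 0 <= n < |P|
    rng     : a random.Random instance, passed in for reproducibility
    """
    if not 0 <= n < len(symbols):
        raise ValueError("n must satisfy 0 <= n < |P|")
    # Sort first so the sample depends only on the seed, not on set ordering.
    observed = rng.sample(sorted(symbols), n)
    # Unobserved symbols are simply absent from the returned mapping.
    return {p: (p in state) for p in observed}


# Hypothetical symbols and a two-state trajectory, for illustration only.
P = {"on_a_b", "clear_a", "clear_b", "holding_a", "handempty"}
trajectory = [
    {"on_a_b", "clear_a", "handempty"},
    {"holding_a", "clear_b"},
]

rng = random.Random(0)
observations = [observe_partially(s, P, 3, rng) for s in trajectory]
```

Each element of `observations` maps three of the five symbols to their truth value in the corresponding state; the remaining two symbols are unknown to the learner.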

Learning to Act and Observe in Partially Observable Domains

Bolander

Gierasimczuk

Liberman

2021

Preprint


We consider a learning agent in a partially observable environment, with which the agent has never interacted before, and about which it learns both what it can observe and how its actions affect the environment. The agent can learn about this domain from experience gathered by taking actions in the domain and observing their results. We present learning algorithms capable of learning as much as possible (in a well-defined sense) both about what is directly observable and about what actions do in the domain, given the learner's observational constraints. We differentiate the level of domain knowledge attained by each algorithm, and characterize the type of observations required to reach it. The algorithms use dynamic epistemic logic (DEL) to represent the learned domain information symbolically. Our work continues that of Bolander and Gierasimczuk (2015), which developed DEL-based learning algorithms to learn domain information in fully observable domains.


“…Our work is part of the growing literature on learning action models for domain-independent planning (Arora et al. 2018), which includes algorithms such as ARMS (Yang, Wu, and Jiang 2007), LOCM (Cresswell, McCluskey, and West 2013), LOCM2 (Cresswell and Gregory 2011), AMAN (Zhuo and Kambhampati 2013), and FAMA (Aineto, Celorrio, and Onaindia 2019). Similar to SAM learning, ARMS (Yang, Wu, and Jiang 2007) also defines rules to infer an action model from a given set of trajectories.…”

Section: Related Work (mentioning)

confidence: 99%

Safe Learning of Lifted Action Models

Juba

Le

Stern

2021

Preprint


Creating a domain model, even for classical, domain-independent planning, is a notoriously hard knowledge-engineering task. A natural approach to solve this problem is to learn a domain model from observations. However, model learning approaches frequently do not provide safety guarantees: the learned model may assume actions are applicable when they are not, and may incorrectly capture actions' effects. This may result in generating plans that will fail when executed. In some domains such failures are not acceptable, due to the cost of failure or inability to replan online after failure. In such settings, all learning must be done offline, based on some observations collected, e.g., by some other agents or a human. Through this learning, the task is to generate a plan that is guaranteed to be successful. This is called the model-free planning problem. Prior work proposed an algorithm for solving the model-free planning problem in classical planning. However, they were limited to learning grounded domains, and thus they could not scale. We generalize this prior work and propose the first safe model-free planning algorithm for lifted domains. We prove the correctness of our approach, and provide a statistical analysis showing that the number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model. We also present experiments on twelve IPC domains showing that our approach is able to learn the real action model in all cases with at most two trajectories. * This arXiv paper is an extended version of a paper with the same title that has been accepted to the International Conference on Principles of Knowledge Representation and Reasoning (KR), 2021.
